A Re-examination of the Accident Proneness Concept


Alexander Mintz and Milton L. Blum 
College of the City of New York 


It is generally accepted that certain individuals consistently have 
many accidents while others do not. This is commonly known as the
principle of accident proneness. A critical examination of the data
reported in the literature points to the desirability of reconsidering the
significance attached to the principle of accident proneness. 

This article has two objectives: (1) to indicate that one of the methods
used to substantiate the principle of accident proneness is unsound and to
show that its use has led in some instances to exaggerated views of
differences in accident proneness; and (2) to propose a method whereby
quantitative estimates of differences in accident liability¹ may be
obtained and to point out the conditions under which it may be used.

The statistical evidence for the principle of accident proneness was 
presented by Greenwood and Woods (6) in 1919. These authors com- 
pared the distribution of accidents in a given population with a simple 
chance distribution for the same number of accidents in a population of 
the same size. Evidence of differences in accident proneness was
obtained: it was discovered that more people had no accidents than might
have been expected “by chance.” Conversely, it was discovered that more
people had many accidents than would have been expected in accordance 
with a simple chance distribution. In other words, Greenwood and 
Woods demonstrated that the obtained distributions of accidents differed 
significantly from chance expectancy. Furthermore, they showed that 
most of their distributions agreed with theoretically computed distribu- 
tions based on the assumption that people differed from each other in their 
likelihood to have accidents. 

Newbold (9) further investigated this problem and pointed out that
the differences in accident liabilities could not be entirely explained simply
in terms of different job hazards. In addition, Newbold, in some of her
work, compared the accident rates for the same people in two successive
periods and reported that significant correlations existed.

1 In the subsequent discussions we shall use the expression “accident proneness” in
referring to personal characteristics of people contributing to the likelihood of their
having accidents. The expression “accident liability” will refer to both personal
characteristics and stable environmental conditions contributing to accident records.

Both Greenwood and Woods, and Newbold were primarily interested 
in the establishment of the existence of a difference between accident 
records and chance expectancy. In this they were successful and ac- 
cordingly the principle of accident proneness was established. 

However, another method has been used to support the principle of 
accident proneness. A number of investigators and writers of books on 
industrial psychology have pointed out that small percentages of people 
have large percentages of accidents and have presented data accordingly. 
In this method the obtained accident distribution is presented as evidence 
for the principle of accident proneness without a comparison to the
distribution that would be normally expected “by chance,” i.e., if all
individuals were equally liable to accidents. This method is fallacious.


The Method of Percentages 


The method of percentages of people and accidents implies an
incorrect assumption, viz., that chance expectation requires that all people
in a population should have the same number of accidents. This is not 
the case. An obvious limitation that has often been overlooked is the 
fact that very often the reported total number of accidents in a popula- 
tion is smaller than the number of people in the population. For example, 
if a group of one hundred factory workers had fifty accidents in one 
year, then a maximum of fifty people could have contributed to the 
accident record and accordingly a maximum of 50% of the population 
would have contributed to 100% of the accidents. Obviously a small 
percentage of the population in this case does not establish the principle of 
accident proneness. However, the number of employees having accidents 
is almost certain to be less than fifty since there is no reason to believe 
that each one should have had only one accident. Such an assumption 
would imply that an accident immunizes its victim against further acci- 
dents. If one makes the assumption of equal liability, the people who 
had one accident should be just as liable to have future accidents as 
those who have not had any. Thus if accident liability is unchanged 
by accidents already had, some people should have two accidents before 
others have had any. In fact, in accordance with chance expectancy 
some people should have had three or more accidents before another had 
a single accident. In dealing a deck of cards it is not improbable that a 
person will receive more or less than the three or four cards in a suit 
that seem to be his share. He may get six, seven or more such cards 
without any laws of probability being violated. Similarly, a person
may have more accidents than seems to be his share in a given population
without being more accident prone than the average.

Thus the assumption of equal accident liability results in different 
accident totals for the individuals within the group. The resulting 
distribution can be readily derived from the statement, “the current rate
at which accidents occur per person is identical in groups of people with
different numbers of accidents in the past.” It follows directly from
this statement that as the number of people who have had no accidents 
decreases, fewer people are likely to have first accidents per unit of time; 
as the number of people who have had first accidents increases, the rate 
of occurrence of second accidents increases proportionately. These and 
other similar statements can be reformulated as a set of differential equa- 
tions, and the solution of this set of equations gives the terms of the 
Poisson distribution. Greenwood and Yule (7) first demonstrated its 
applicability to the accident problem. The Poisson is a discrete dis- 
tribution rather than a continuous one. As applied to the accident 
problem, its consecutive terms give the predicted numbers of people
who had no accidents, one accident, two accidents, etc. The terms are
$Ne^{-m}$, $Ne^{-m}m$, $Ne^{-m}\frac{m^2}{2!}$, $Ne^{-m}\frac{m^3}{3!}$, etc., where $N$ is the number of people,
$e$ is the constant 2.71828..., and $m$ is the mean number of accidents per
person.
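
As a rough illustration of these terms, the following sketch computes the expected equal liability counts for the hypothetical group mentioned earlier, one hundred workers with fifty accidents in a year (so that m = 0.5). The function name `poisson_expected_counts` and the choice to stop at five accidents are assumptions made for illustration only.

```python
import math

def poisson_expected_counts(n_people, mean_accidents, max_accidents):
    """Expected numbers of people with 0, 1, ..., max_accidents accidents
    under equal liability: N * exp(-m) * m**k / k!."""
    counts = []
    term = n_people * math.exp(-mean_accidents)  # k = 0 term, N * e^(-m)
    for k in range(max_accidents + 1):
        counts.append(term)
        term = term * mean_accidents / (k + 1)   # next term: multiply by m / (k + 1)
    return counts

# Hypothetical group from the text: 100 workers, 50 accidents, so m = 0.5.
counts = poisson_expected_counts(100, 0.5, 5)
for k, expected in enumerate(counts):
    print(f"{k} accidents: {expected:6.2f} people expected")
```

Under these assumptions about 61 of the 100 workers would be expected to have no accident at all, so that roughly 39% of the group would account for all fifty accidents by chance alone.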

A number of sets of data will now be discussed in order to illustrate 
the inadequacy of the method of percentages of people and accidents. 

Based upon original records obtained by the authors from a foundry 
it was found that 1.8% of the 280 men in the day shift had 11.4% of the 
accidents; 10% of the men had 44.3% of the accidents. In the night
shift 5.8% of the 120 men had 12.5% of the accidents and 37.5% of the 
men had all of the accidents. A computation of the distribution of 
accidents in accordance with chance expectancy (equal liability distribu- 
tion) indicated that the differences between the obtained and expected 
distributions were not significant. In accordance with the theoretical 
distribution, 1.4% of the people should have had 8.3% of the accidents 
and 8.9% of the people should have had 38.8% of the accidents. These 
percentages obtained from a theoretically computed equal liability
distribution show that the accident distribution actually obtained is in
accordance with chance expectancy and does not establish the existence of
accident proneness.

A study that is often referred to in discussions of accident proneness 
is that of the National Association of Taxicab Owners and the Metro- 
politan Life Insurance Company (11). These data deal with the records 
of 1294 drivers employed by several taxicab companies. Viteles (13) 
states that “the incidence of accident proneness in the operation of motor
vehicles has been well demonstrated in this study.” “It is interesting to 
note that the data obtained in accident prone studies in other types of 
industries if plotted would closely conform to the curve shown. . . .” 

Neither the authors of the report nor the author of the textbook 
compared the data with the simple chance distribution. Such computa- 
tions have been made and are presented in Figure 1. 
Fig. 1. Relationship between cumulative percentage of taxi drivers and of accidents.


The solid line in the figure represents the cumulative percentages 
of accidents corresponding to cumulative percentages of drivers, based 
on the data as quoted in the original report. The dotted line represents 
the corresponding cumulative percentages from an equal liability dis- 
tribution. 

The two lines are obviously very similar in shape. The argument (13) 
could be repeated verbatim with percentages from the chance distribution 
substituted for obtained percentages, with very little loss in apparent
persuasiveness. In the chance distribution, 23.5% of the people would 
have had no accidents instead of the obtained 25.2%. The best and the 
worst 50% would have had 18.3% and 81.7% of the accidents respec- 
tively, instead of the actually obtained 17.2% and 82.8%. The worst 
third of the drivers would have had 63.9% of the accidents (instead of 
69.3%); the worst 10% would have had 24.7% instead of 31.9% of the 
accidents. 
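
The chance percentages quoted above can be approximated with a short sketch. It assumes a single Poisson distribution whose mean is inferred from the quoted 23.5% of accident-free drivers (m = -ln 0.235, about 1.45). That inference, the function names, and the truncation point `k_max` are assumptions made for illustration, since the original table is not reproduced here.

```python
import math

def poisson_probabilities(mean, k_max=60):
    """Poisson probabilities for 0..k_max accidents (tail beyond k_max ignored)."""
    probs = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * mean / k)
    return probs

def accident_share_of_worst(mean, fraction, k_max=60):
    """Share of all accidents expected to fall on the given fraction of people
    with the most accidents, under an equal liability (Poisson) distribution."""
    probs = poisson_probabilities(mean, k_max)
    remaining = fraction          # fraction of people still to be allocated
    accidents = 0.0               # accidents per head of population collected so far
    for k in range(k_max, -1, -1):    # start with the highest accident class
        take = min(probs[k], remaining)
        accidents += take * k
        remaining -= take
        if remaining <= 0:
            break
    return accidents / mean

m = -math.log(0.235)              # assumed: mean inferred from the zero class
worst_half = accident_share_of_worst(m, 0.5)
worst_third = accident_share_of_worst(m, 1 / 3)
print(f"worst half:  {100 * worst_half:.1f}% of accidents")
print(f"worst third: {100 * worst_third:.1f}% of accidents")
```

With this inferred mean the sketch gives about 81.7% of the accidents for the worst half of the drivers and about 63.9% for the worst third, in line with the equal liability figures quoted in the text.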

In spite of the fact that the two distributions are very similar in shape,
the difference between them is statistically significant, the chi square
being 122.77 (d.f. = 6, P < .0001). In other words, factors other than
so-called chance factors are definitely present but do not markedly change
the general shape of the chance distribution.

Another often referred to study on accident proneness is the one re- 
ported by Slocombe and Brakeman (12). Their data are based upon 
accident records of 2300 men employed by the Boston Elevated Railway 
Company. 

In discussing their data as indicative of differences in accident prone- 
ness, Slocombe and Brakeman classified the men with four or less acci- 
dents as “low accident men” and those with five or more accidents as 
“high accident men.” This arbitrary division placed 1828 men in the 
first category and 472 men in the latter division. The “low accident” 
men averaged 2.1 accidents while the “high accident” men averaged 7
accidents. Slocombe and Brakeman did not compute the chance
expectancy of the number of men having four accidents or less. Actually,
in a simple chance distribution, 1824 men should be expected to be in
this category and so only four more men of the total 2300 are in the “low
accident” group than would be obtained by chance. According to chance
expectancy, the “low accident” and “high accident” men should have
averaged 2.4 accidents per man and 5.8 accidents per man respectively.
The difference is not much smaller than the one actually obtained. This 
does not mean that there is no evidence for differences in accident 
proneness in the data. It merely means that Slocombe and Brakeman’s 
line of argumentation is inconclusive. 

More recent data based upon a random sample of licensed drivers 
in the state of Connecticut (2) have been analyzed by Cobb (1). He 
computed the amount by which the variance of accident records exceeds 
the variance of the Poisson distribution and thus determined that these 
accident records cannot correlate with a perfect test of accident
proneness to a degree higher than +.44.

DeSilva (2) refers to these data and uses as argument for the principle
of accident proneness mainly the fact that four per cent of the drivers
were responsible for 36% of the accidents. In a simple chance
distribution 2.4% of the drivers would be responsible for 21.2% of the accidents.
Again a comparison of percentages of people and of accidents is
inconclusive. The figures just quoted based on the assumption of a simple
chance distribution look almost as impressive as the figures in the actually
obtained distribution.²


Quantitative Estimate of Differences in Accident Liability


It is possible to arrive at an estimate of the magnitude of differences 
in accident liability (as distinguished from differences in accident records) 
in the case of many populations. The procedure has been previously 
used by Cobb (1) as a step in estimating the maximum correlation between 
accident records and psychological tests. This procedure can be used in 
many instances to estimate the magnitude of differences in accident 
liability, but it is also necessary to mention that this procedure is not 
universally applicable. 

The presence of differences in accident liability of individuals in a 
population results in a composite of Poisson distributions of the accident 
records. The reason for this is as follows: Each particular degree of 
accident liability present in a population should result in a Poisson dis- 
tribution of the accident records. When two or more degrees of accident 
liability are present the resulting distribution is the sum of the two or 
more corresponding Poisson distributions. If the distribution of accident 
liability is a continuous function the resulting probability function of 
accidents is a composite of Poisson distributions which can be deter- 
mined by integration. 
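
A minimal numerical sketch of such a composite is given below. It assumes just two hypothetical degrees of accident liability (80% of the people with a mean of 0.5 accidents and 20% with a mean of 2.0; these figures are invented for illustration and do not come from any of the studies discussed) and shows that the variance of the resulting accident records exceeds their mean by the variance of the liabilities.

```python
import math

def poisson_probabilities(mean, k_max=80):
    """Poisson probabilities for 0..k_max accidents (tail beyond k_max ignored)."""
    probs = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        probs.append(probs[-1] * mean / k)
    return probs

def composite_poisson(groups, k_max=80):
    """groups: list of (proportion of people, mean accidents).
    Returns the weighted sum of the corresponding Poisson distributions."""
    mixture = [0.0] * (k_max + 1)
    for proportion, mean in groups:
        for k, p in enumerate(poisson_probabilities(mean, k_max)):
            mixture[k] += proportion * p
    return mixture

def mean_and_variance(probs):
    mean = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, variance

# Hypothetical population: two degrees of accident liability.
groups = [(0.8, 0.5), (0.2, 2.0)]
mixture = composite_poisson(groups)
mean, variance = mean_and_variance(mixture)
liability_variance = sum(w * (lam - mean) ** 2 for w, lam in groups)

print(f"mean of accident records:     {mean:.3f}")
print(f"variance of accident records: {variance:.3f}")
print(f"variance of liabilities:      {liability_variance:.3f}")
```

In this invented case the mean is 0.8 and the variance 1.16, the excess of 0.36 being exactly the variance of the two liabilities about their common mean.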

When a given distribution of accident records is found to conform 
closely to a composite of Poisson distributions the evidence is consistent 
with the assumption that the differences between the accident records 
of different people are due partly to differences in their accident liability 
and partly to “chance” factors not predictable in terms of knowledge of 
the people or of their accident records. In this assumption, the “chance”
factors produce the variability within the constituent Poisson distribu- 
tions while the differences in accident liability are responsible for the 
differences between their means. In accordance with such an assumption,
one may analyze the obtained variance of a set of accident records
into two constituent variances and view one of them as representing the
operation of the “chance” factors, the other as characterizing the
differences in accident liability. The former is the weighted arithmetic
average of the variances of the Poisson distributions. As Cobb has
shown, its value can be readily estimated as equal to the mean number
of accidents per person.³ Thus the residual variance representing the
operation of differences in accident liability may be estimated if one
subtracts the mean number of accidents per person from the obtained
variance of accident records. We have performed this computation for
a considerable number of accident distributions and have expressed the
resulting variances attributable to unequal accident liabilities as
percentages of the corresponding total variances of accident records.

2 Tables 1 (Foundry Data), 2 (Taxicab), 3 (Street Car Drivers), 4 (Auto Drivers),
7 (Newbold’s Data), and 9 (Conn. car drivers) have been deposited with the American
Documentation Institute to reduce printing costs. For these six pages of tables order
Document 2633 from American Documentation Institute, 1719 N Street, N.W.,
Washington 6, D. C., remitting $0.50 for microfilm (images 1 inch high on standard 35 mm.
motion picture film) or $0.60 for photocopies (6 x 8 inches) readable without optical aid.

The argument of the last paragraph pre-supposes that the obtained 
accident distribution approximates a composite Poisson distribution. 
Theoretically, an infinite variety of such distributions could be computed, 
depending on the assumed form of the distribution of the means of the 
Poisson distributions. Actually only one kind of such composites seems to 
have been used in research, viz., Greenwood and Yule’s (7) “unequal
liability distribution” (“UD”). This distribution is based on the
assumption that accident liability of people is distributed along a Pearson
Type III curve, a continuous skewed unimodal curve. Its equation may
be found in several sources, e.g. (3), (8). Many sets of accident data can 
be actually approximated by composite Poisson distributions based on 
such assumed distributions of accident liability. It should be noted,
however, that Greenwood and Yule’s “UD” distribution is by no means
the only possible unequal liability (composite Poisson) distribution.
Greenwood and Yule (7) report a set of equations for a different type 
of composite Poisson distribution, based on the assumption that accident 
liability is normally distributed. This distribution does not seem to have 
been used in research. The possibility should not be overlooked that 
this distribution or still another composite Poisson distribution, based 
on some other assumed distribution of accident liability, might prove 
to be useful in research. In this paper, composite Poisson distributions 
based on the Pearson III curve were used most of the time. In a few 
instances another possibility was explored to some extent; some sets of 
data suggested discontinuous distributions of accident liability, the dis- 
continuity being due to the presence of small numbers of deviant indi- 
viduals. On the other hand, the presented analysis of the sample 
variance into two components is not legitimate if the obtained distribu- 
tion deviates significantly from any composite Poisson distribution. 

3 This follows from the fact that in a simple Poisson distribution the variance is
always equal to the mean. Hence, in a composite of such distributions, the mean of
the variances is equal to the mean of the means.

The line of reasoning just developed will now be applied to the more
widely known studies of accident proneness. 

The Greenwood and Woods study (6) presents fourteen sets of data. 
The majority of their findings agree rather well with the composite 
Poisson distributions computed according to Greenwood and Yule (7). 
In other words, the obtained figures are in accord with the assumptions: 


1. Accident proneness varies from person to person and its distri- 
bution is represented by a unimodal continuous skewed curve known 
as Pearson type III. 

2. Accident proneness of a person is unaltered by accidents he 
may have. 


Twelve of the fourteen sets of data do not differ significantly from 
the corresponding theoretically computed figures. The P’s reported by 
Greenwood and Woods obtained from the chi square technique range 
from 0.15 to 0.93.⁴ The sum of the chi squares for these 12 sets of data,
based upon our computations, is 35.33, which for 30 degrees of freedom 
results in a P equal to about .23. The two deviant distributions will be 
discussed later. 
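
The probabilities attached to such chi square values can be checked approximately without tables. For an even number of degrees of freedom the upper tail probability has a closed form, namely exp(-x/2) multiplied by the sum of (x/2)^i / i! for i from 0 to (d.f./2) - 1, and the sketch below uses that identity. The function name is illustrative; the second call refers to the taxicab comparison reported earlier (chi square 122.77 with 6 degrees of freedom).

```python
import math

def chi_square_upper_tail(chi_square, degrees_of_freedom):
    """Probability of a chi square at least this large.
    Uses the closed form valid only for an even number of degrees of freedom."""
    if degrees_of_freedom <= 0 or degrees_of_freedom % 2 != 0:
        raise ValueError("this closed form requires a positive even number of degrees of freedom")
    half = chi_square / 2.0
    term = 1.0      # (x/2)**i / i! for i = 0
    total = 0.0
    for i in range(degrees_of_freedom // 2):
        total += term
        term = term * half / (i + 1)
    return math.exp(-half) * total

p_greenwood = chi_square_upper_tail(35.33, 30)   # twelve Greenwood and Woods sets
p_taxicab = chi_square_upper_tail(122.77, 6)     # taxicab data against chance
print(f"P for 35.33 with 30 d.f.: {p_greenwood:.3f}")
print(f"P for 122.77 with 6 d.f.: {p_taxicab:.2e}")
```

For the twelve Greenwood and Woods sets this gives a probability of about .23, in agreement with the figure stated above, while the taxicab value falls far below .0001.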

Thus it is possible to approximate closely the majority of Greenwood 
and Woods’ tables by theoretically computed distributions based on the 
assumptions that accident proneness is constant for each person and 
distributed in different people in accordance with a Pearson III curve. 
This finding is one of the principal ones in favor of the existence of dif- 
ferences in accident proneness. 

How large then are these differences in accident proneness if we take
the findings at their face value and assume that variations in “chance”
and differences in accident proneness are the only factors accounting for
these distributions of accident records? Table 5 presents the data
pertaining to the relative size of these differences in the Greenwood and Woods
study.

For each one of Greenwood and Woods’ tables the estimated per- 
centage of the variance of accident records attributable to differences in 
accident liability is given. As stated on a preceding page, the estimated 
variance of accident liability is the difference between the obtained 
variance of accident records and the mean number of accidents. Di- 
viding this difference by the variance of accident records we obtain the 
percentage of the variance attributable to differences in accident lia- 
bility. In addition, the following data are also given: the number of 
cases, the mean and the variance of accident records. 
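
The computation just described amounts to one line of arithmetic. The sketch below applies it to two rows of Table 5 (II (A) and VIII (A)); the function name and the error check are illustrative additions.

```python
def liability_variance_share(mean, variance):
    """Percentage of the variance of accident records attributable to
    differences in accident liability: (variance - mean) / variance * 100."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return (variance - mean) / variance * 100.0

# Two rows of Table 5 (Greenwood and Woods): (mean, obtained variance).
rows = {"II (A)": (0.465, 0.691), "VIII (A)": (2.800, 6.720)}
shares = {label: liability_variance_share(m, v) for label, (m, v) in rows.items()}
for label, share in shares.items():
    print(f"Table {label}: {share:.1f}% of variance attributable to liability")
```

The function returns a negative value whenever the obtained variance falls below the mean, in which case the analysis into two components is not meaningful.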

4 The computations do not appear to be accurate in all cases. It is to be noted that
the paper appeared in 1919 prior to Fisher’s pointing out the procedure for determining
degrees of freedom.

Table 5

Percentages of Variance Attributable to Differences in Accident Liability,
from Greenwood and Woods Original Data

Greenwood                            Obtained
and Woods    Number                  Variance    (m2' - m)/m2'
Table No.    of Cases    Mean (m)    (m2')       x 100

I (A)        750         0.576       0.540       —
I (B)        580         0.478       0.491       —
II (A)       647         0.465       0.691       32.7%
II (B)       584         0.433       0.521       16.9%
III          100         3.040       6.938       56.2%
IV           414         0.483       1.008       52.1%
V            201         0.473       0.508       7.0%
VI           198         1.318       1.873       29.6%
VII (A)      59          0.983       1.203       19.3%
VII (B)      136         0.794       0.928       14.4%
VIII (A)     50          2.800       6.720       58.3%
VIII (B)     50          1.920       3.313       42.1%
IX (A)       55          2.473       3.704       33.2%
IX (B)       61          0.705       0.897       21.4%


The median percentage of the total variance, attributable to differ- 
ences in accident liability is 31.15. The percentages range from 7% to 
58.3%. In nine of the twelve cases the percentage is less than 50. 
These figures hardly correspond to the impressions one is likely to derive 
from textbook accounts. The share of differences in accident liability 
in the variance of accident records is very variable; it exceeds 30% in 
only half of the cases while the rest of the variance which is more than 
twice as large must be attributed to unpredictable “chance” factors.

Newbold (9) collected a large number of sets of data from a number 
of factories. The factories were chosen on the basis of uniformity of 
the work performed, completeness of accident recording and opportunities 
for many minor accidents. The large majority of the accidents were 
trivial in nature, the author stating that the serious injuries were too 
few for correlational work. The findings differ in some respects from 
those of Greenwood and Woods. 

A large variety of results can be found in Newbold’s material.
Nevertheless, in general the ratio $\frac{m_2' - m}{m_2'} \times 100$ tends to be considerably
larger than in the data of Greenwood and Woods. It also tends to be
larger than in the other studies we have examined. This difference be- 
tween Newbold’s data and those of the other investigators is due in 
part to the fact that the mean numbers of accidents per person are rather 
large as compared to those of most of the other distributions. The
irregularly variable factors should become relatively less and less important in
the long run. Still, this is not the whole explanation. The ratios com- 
puted for Newbold’s material remain large even when compared to ratios 
from distributions with similar means. Table 6 presents these ratios 
as computed from the statistics given in Newbold’s paper; the number 
of cases and mean numbers of accidents as given by Newbold and the 
variances (squares of Newbold’s standard deviations) are also given. 
The figures may be compared to the corresponding ones in Table 5. 

The median percentages are 71.6 and 56.05 for the men and women 
respectively. The range is very great, the largest figure being 90% 
while at the other extreme there is an obtained variance which is actually 
slightly smaller than that of the corresponding Poisson distribution; 
this distribution closely approximates a simple chance distribution. 
These percentages do not accurately represent the share of differences in 
accident liability in the variance of accident records in all cases. In- 
spection of Newbold’s curves suggests that many of the obtained accident 
distributions deviate significantly from composite Poisson distributions. 
This matter was only partially investigated. The amount of work 
involved in the computation of composite Poisson distributions for
thirty-nine sets of data would have been prohibitive, particularly because these
data are given by Newbold in the form of graphs rather than tables. 
Many of these graphs appear to have been inaccurately drawn, inasmuch 
as there are discrepancies between the totals of workers and accidents 
as read off from the graphs and as given in Newbold’s Table. 

Nevertheless, it can be shown that in some of Newbold’s sets of data 
composite Poisson distributions are appropriate and the percentage of 
the variance attributable to differences in accident liability is large. As 
an example, Table 7⁵ presents the data from Newbold’s graph AIII,
together with the corresponding composite Poisson figures. The closeness
of the fit is apparent. The accident liability share is 75.8%. 

Some of Newbold’s sets of data suggest that the distribution of 
accident liability was a discontinuous one; in these sets of data the great 
bulk of the cases fit either a simple or a composite Poisson distribution, 
but there are also a few deviant cases which lie outside of such distribu- 
tions. Most of the obtained variance of accident records due to accident 
liability may be due to the presence of these deviant cases; in other words, 
large deviations from the average accident liability appear only in a very 
small minority of cases. Thus Newbold’s set EIII is essentially a dis- 
tribution of the simple Poisson type, plus one markedly deviant worker. 
Set EV may be viewed as a distribution of the composite Poisson type 
(excess variance = 41%) plus 9 deviant workers. Table 8 presents 
these data. 


5 See footnote 2. 

Table 6

Newbold’s    Number                  Variance    (m2' - m)/m2'
Table No.    of Cases    Mean (m)    (m2')       x 100

EIII         226         .18         .59         68.0
FI           22          [illegible] .20         —
EIV          256         .41         .69         40.5
FI           81          .43         .45         4.2
MIV          106         .48         .59         19.1
EII          281         [illegible] .94         43.8
BII          299         [illegible] .81         29.6
I            190         .68         1.72        60.5
g            50          1.04        1.99        48.7
GI           47          1.47        3.76        60.9
[illegible]  82          1.61        5.66        71.6
BI           148         1.81        5.11        64.6
[illegible]  218         1.95        7.13        72.6
[illegible]  181         2.50        6.60        62.1
AIII         304         2.56        10.56       75.8
EVI          93          2.66        12.53       78.8
EV           77          2.73        18.84       85.0
N            284         2.90        23.33       87.6
EI           440         3.64        13.76       73.1
AII          352         [illegible] 17.14       77.9
MI           301         [illegible] 14.90       73.3
MII          376         [illegible] 14.06       72.7
MV           92          [illegible] 18.15       78.6
EVII         57          5.6         56.25       90.0
AI           204         [illegible] 41.86       84.4
MI           [illegible] [illegible] .53         30.6
GII          50          [illegible] 1.04        50.0
GI           120         [illegible] 1.64        61.5
MV           110         .62         1.46        53.5
I            161         [illegible] 1.21        42.1
H            346         [illegible] 1.35        41.3
MIII         142         [illegible] [illegible] 40.1
BI           145         [illegible] 2.04        48.2
K            125         [illegible] 3.24        58.6
DII          98          [illegible] 3.39        58.9
BII          100         [illegible] 5.57        61.9
MII          161         [illegible] 8.58        73.2
C            58          [illegible] 7.88        68.7
DI           28          [illegible] 15.52       65.0

The differences between Newbold’s findings and those of Greenwood 
and Woods, and of other investigators whose material is examined in this 
paper may possibly be attributed to the fact that her material consisted 
almost entirely of minor accidents. In spite of Newbold’s statement, 
the reporting of accidents may not have been complete. It is difficult
to ascertain the degree of completeness with which minor accidents were
reported and there may have been individual differences in the reporting
of accidents, producing the illusion of large differences in accident
liability. On the other hand, constant personal characteristics may play
a more direct role in the causation of minor accidents than in that of
major accidents. Psychoanalysts generally believe that many accidents
are unconscious self-injuries. It is possible that such unconscious
self-injuries usually result in minor damage, just as in hysteria, in which minor
self-injuries are common while major injuries are unusual. Minor
accidents in industry may be often due to psychological mechanisms of the
hysterical type.⁶


Table 8

Comparison of Two of Newbold’s Sets of Data with Theoretically
Computed Distributions

* The discrepancy between the “actual” and the “equal liability” accident totals is
due to a similar discrepancy between the totals as given in a table in Newbold’s paper,
and as obtained from her curve.


6 This hypothesis was suggested to the writers by E. Emmons. 
The distribution of the Connecticut licensed car drivers is essentially
a composite Poisson distribution. A Greenwood and Yule “unequal 
liability” distribution fits the data rather well, except at the upper end. 
The results can be accounted for if one assumes that the distribution of 
accident liability deviates slightly from a Pearson III curve. The 
estimated portion of the variance of accident records attributable to 
differences in accident liability is 21.2%. Table 9 presents these data.⁷

There is corroborative information from other sources, indicating that 
differences in accident liability often account for only a relatively small 
portion of the variance of accident records. The correlations between
accident records in different periods of time reported by Newbold (9) 
and more recently by Ghiselli and Brown (5) are in most instances not 
high. Newbold’s correlations range from —.01 to +.71, with a median 
of +.36. Ghiselli and Brown’s correlations range from +.15 to +.80 
with a median of +.42; we omit the intercorrelations between different 
kinds of accidents presented in both papers which are considerably lower. 
Such correlations justify inferences which are similar to those we arrived 
at by the use of a different method. 
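
The connection between such correlations and the variance analysis can be made explicit under the composite Poisson assumptions. The following sketch is not taken from the papers cited. It assumes two periods of equal exposure, a liability $\lambda$ that stays constant for each person, and accidents in the two periods that are independent once $\lambda$ is given. The symbols $X_1$ and $X_2$ (a person's accident counts in the two periods) and $\lambda$ are introduced here for illustration; $m$ and $m_2'$ denote the mean and variance of accident records in one period.

```latex
\begin{align*}
E(\lambda) &= m, \\
\operatorname{Var}(X_1) = \operatorname{Var}(X_2) &= E(\lambda) + \operatorname{Var}(\lambda) = m_2', \\
\operatorname{Cov}(X_1, X_2) &= \operatorname{Var}(\lambda) = m_2' - m, \\
r_{12} &= \frac{\operatorname{Cov}(X_1, X_2)}{\sqrt{\operatorname{Var}(X_1)\,\operatorname{Var}(X_2)}}
        = \frac{m_2' - m}{m_2'}.
\end{align*}
```

On these assumptions the expected correlation between the records of two periods equals the proportion of the variance attributable to differences in accident liability, so that correlations with medians near +.4 would correspond to shares of roughly 40%.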

It should also be noted that the differences between automobile in- 
surance rates for people with different accident records are nonexistent. 
This practice is in conformity with our findings. The usual textbook 
discussions of accident proneness would suggest very different insurance 
rates for different accident records. 

When no composite Poisson distribution conforms to a set of accident 
data the suggested procedure is not applicable. The existence of factors 
must be assumed, which alter the shapes of the constituent Poisson 
curves. Changes in accident liability of people as a function of previous 
accidents encountered suggest a possible explanation of such results. We 
did not attempt to verify this possibility inasmuch as there seemed to be 
no way of arriving at a reasonably plausible hypothesis about the course 
of these changes in terms of information available at present. The only 
hypothesis suggested so far in the literature seems to have been the one 
implied in Greenwood and Yule’s “Biassed distribution,” and it is un- 
tenable theoretically and therefore unsuitable for research. This distri- 
bution is simply a Poisson distribution with a different first term. If 
there were no initial differences in accident liability, but the first accident 
changed the accident liability of its participants which would subse- 
quently remain constant, the resulting distribution would not be an in- 
complete Poisson distribution, because the one accident class would not 
grow as in the “simple chance’’ case. An incomplete Poisson distribution 
could be produced only by continuing changes in accident liability with
successive accidents, and it would be a strange coincidence if these changes
should be so graded as to produce a tail end of a Poisson distribution
which has a completely different derivation.

7 See footnote 2.

The distributions which deviate significantly from any composite 
Poisson distributions are two of Greenwood and Woods’ distributions 
(their Table 1A and 1B), the distributions of taxicab accidents and the 
distribution of street car accidents. Inspection of the data indicates 
that the obtained distributions are more leptokurtic than Poisson distri- 
butions, and compounding several of the latter can only flatten out the 
resulting shape. Several of Newbold’s distributions may be in the same 
category; they were not examined in detail for reasons stated earlier. 
The share of differences in accident liability in the total variance cannot 
be determined in such cases. The existence of other factors than differ-
ences in accident liability and unpredictable “chance” factors must be
assumed. 

Discussion 


It must be remembered that not all differences in accident liability 
are differences in accident proneness viewed as an individual character- 
istic. This point is not a new one; it has been made among others by 
Newbold and by Cobb. It is disregarded by investigators who combine 
data about street car accidents or taxi accidents from different cities. In 
factory work, different jobs differ in conditions of safety. In automobile 
or other vehicle driving, the safety conditions are not necessarily the 
same from route to route, in city compared with city. The amount of
mileage driven, necessary driving in adverse weather, etc., must contri- 
bute more opportunities for accidents and these are not functions of 
accident proneness defined as an individual trait. For example, only 
21.2% of the variance of the accident records of the Connecticut drivers 
was due to differences in constant accident rates. When one considers 
the hazards of driving just mentioned, it seems logical to state that there 
is not much room for differences in accident proneness as a psychological 
characteristic, insofar as these data are concerned. 
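The share of variance quoted above can be illustrated with a short sketch. It assumes the composite Poisson model used in this paper: each person has a constant accident rate, and accidents scatter around that rate in Poisson fashion, so that the total variance equals the mean (the "chance" component) plus the variance of the individual rates. The function name and the sample frequencies are hypothetical; they are not data from the studies discussed.

```python
def liability_share(freq):
    """freq[k] = number of people with k accidents.

    Returns (mean, variance, share of variance attributable to
    differences in constant accident rates), assuming a composite
    Poisson model in which the chance variance equals the mean.
    """
    n = sum(freq)
    mean = sum(k * f for k, f in enumerate(freq)) / n
    var = sum(f * (k - mean) ** 2 for k, f in enumerate(freq)) / n
    chance_var = mean  # Poisson component: variance equals the mean
    share = max(0.0, (var - chance_var) / var)
    return mean, var, share


# Hypothetical distribution: 700 people with no accidents, 220 with one, etc.
freq = [700, 220, 60, 15, 5]
mean, var, share = liability_share(freq)
print(f"mean={mean:.3f} variance={var:.3f} liability share={share:.1%}")
```

In this made-up example about one-fifth of the variance is attributable to differences in liability; the remainder is what a simple chance distribution with the same mean would produce.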

We have pointed out that in many instances the portion of the vari- 
ance of accident records attributable to differences in all forms of accident 
liability is relatively small as compared to the residual variance attribu- 
table to the operation of factors which are not predictable in terms of 
either the constant characteristics of people or of their previous accident 
records. These unpredictable or “chance” factors when operating alone 
give a so-called simple chance or equal liability or Poisson distribution. 
The expression “chance factors” should not be misunderstood. They
are not necessarily unpredictable in terms of changing features of the life 
situation. Thus a well known psychoanalyst spoke to one of the writers 
about a man he knew who had a temporary period of accident proneness 
as a result of marital trouble, during which time he had several near-
accidents in rapid succession. “Chance” refers only to lack of predic-
tability in terms of constant characteristics of the individual.

There are many kinds of such “chance” factors. One kind does not 
seem to have received enough attention in the literature. Even when 
a person is clearly at fault in causing an accident, the accident might 
not have occurred if the circumstances had been different. One of the 
writers was once in a car driven by a man who did badly enough to have 
caused a very serious accident: the driver became frightened by a wasp 
on his leg and stopped looking at the road; shortly afterwards the car 
travelled into a ditch at the bottom of an embankment to the left of the 
highway. There had been no cars in the other traffic lane at the moment 
he crossed it, the embankment was not steep and there was no accident. 
About half a mile further there was a steep drop into a river on the left 
side. The expression “luck” seems to be quite appropriate here.

As Cobb pointed out, the correlation between accident records and 
a perfect test of accident proneness need not be high. One cannot use 
any arbitrary criterion for classifying people as excessively accident-
prone. For example, Poffenberger (10) states that “accident prone
drivers are those who have two or three times as many accidents as the
average driver . . . the term need not be restricted to auto accidents
. . . for it covers equally well accident repeaters in industry.” In
many distributions examined here the number of accidents per person 
is one-half an accident or less. According to Poffenberger then, this 
would mean that persons with one or more accidents are to be considered 
as accident prone. This is obviously unfair. It is legitimate to select 
for study those people who have more than the average number of acci- 
dents but they should not be automatically classified as excessively 
accident prone without further evidence. Actually within a simple 
chance distribution some people are likely to have two to three times as 
many accidents as the average person. One can verify this by referring 
to the Poisson distributions in our tables. In most published distribu- 
tions only a very small minority have accident records which lie com- 
pletely above the point at which the Poisson distribution gives negligible 
values. As one approaches this point, one finds additional cases of more 
than average accident proneness, but some people with only average 
accident proneness who have had bad luck or temporary difficulties are 
also included in the group of people who have had many accidents. The 
problem of the exact estimation of the relative number of accident-prone 
individuals and bad luck individuals in any particular group of accident 
records is complicated. One should not attempt to make rough estimates 
without a comparison of obtained frequencies with the corresponding 
Poisson frequencies. 
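The point about a simple chance distribution can be checked numerically. The following is a minimal sketch, assuming a mean of one-half accident per person (the figure mentioned above) and nothing beyond the Poisson formula; the function names are illustrative only.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k accidents under a simple chance distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_at_least(k, mean):
    """Probability of k or more accidents."""
    return 1.0 - sum(poisson_pmf(j, mean) for j in range(k))


mean = 0.5  # assumed average number of accidents per person
p_one_or_more = prob_at_least(1, mean)   # at least twice the average
p_two_or_more = prob_at_least(2, mean)   # at least four times the average
print(f"{p_one_or_more:.3f} {p_two_or_more:.3f}")  # prints 0.393 0.090
```

With equal liability for everyone, roughly 39 per cent of the group would still have one or more accidents, which is already twice the average or more.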
Summary


1. A commonly used method of comparing percentages of men and of 
accidents proves nothing about the existence of differences in accident 
proneness. Examples proving the inconclusive nature of the method are 
cited. 

2. Comparison of obtained accident distributions with simple chance 
(Poisson) distributions establishes that there are differences in accident 
liability but does not indicate whether these differences are large or small 
and does not exclude the simultaneous operation of unpredictable 
“chance” factors. 

3. Different accident records do not necessarily represent different 
degrees of accident liability. A method for analysis of the variances 
of accident records of people into two component variances is suggested, 
one component attributable to differences in accident liability, the other 
to unpredictable “chance factors.” It is pointed out that the method is
only applicable when the obtained distribution resembles a composite 
of Poisson distributions. 

4. A number of published distributions of accidents are examined 
by the use of the above method. The variance attributable to differences 
in accident liability varies considerably. 


In the distributions which are examined in this paper and which do 
not involve primarily minor accidents, the variance attributable to
differences in accident liability is in most cases between twenty and forty 
per cent of the total variance of accident records. Although differences 
in accident liability should not be overlooked as a factor in the different 
accident records of people, the effect of this factor is rather small as com- 
pared to the residual 60 to 80 per cent attributable to unpredictable 
factors. It is therefore apparent that in many instances personal accident 
proneness, which is but one of the components of accident liability, has 
been an overemphasized factor. 


Received November 2, 1948. 
Method of Paired Comparisons and a Specification Scoring
Key in the Evaluation of Jobs * 


G. A. Satter 


University of Michigan 


Within recent years, public and industrial employers have increasingly 
attempted to place their wage structures on objective bases. Among the 
techniques employed to this end are those which are commonly referred 
to as “job evaluation methods.” Collectively, these methods represent 
attempts to rate jobs in order to determine their relative worth with 
respect to other jobs and to use the job’s standing, within the group 
of which it is a member, as a basis for assigning a dollars-and-cents 
value to it. 

The most widely used methods fall into four general classes: (a) Those 
in which the operation of evaluation is one of comparing job against job 
in terms of the job’s overall worth (Ranking Method); (b) in which it is 
one of comparing job against job in terms of specific ‘elements’ or traits 
(Factor Comparison Method); (c) in which it is one of comparing the job 
against an arbitrarily defined scale of overall worth (Classification 
Method); and (d) in which it is one of comparing job against arbitrarily 
defined scales covering individual job traits or “elements” (Point Evalua- 
tion Method). 

From time to time, various authors have described alternatives to, or 
modifications of, the above basic methods but for the most part these 
methods have retained their popularity with surprisingly few modifica- 
tions. Thus, Viteles (10) and, more recently, Otis and Leukart (7) 
have recommended that the Method of Paired Comparisons be used as 
an alternate to the Ranking Method. So far as the present writer knows, 
no organization has ever given this recommendation a trial. Similarly,
there are other scaling methods which might profitably be applied to the 
problem of jobs; on its face, the problem of scaling jobs does not seem to 
be pronouncedly different from that of scaling other subject matters. 
These alternative methods, too, have been neglected. 

The present report describes the results of applying two psychometric
techniques to the problem of building job scales in two industrial plants. 

* The writer expresses appreciation to Mr. A. J. Miller, Assistant Director of Indus-
trial Relations, The Mead Corporation, for his advice and support on these projects
and to Hugh Black, C. Alvin Hoffman, and Robert Rock who assumed major responsi-
bility for the collection and analysis of the data presented here.
One procedure involves the application of the Method of Paired Com-
parisons and the other, the development of scoring keys which can be 
applied to job specifications. Both procedures are oriented toward de- 
veloping a scale the points of which are defined by jobs and which can 
be used in making the kinds of job measurements which are helpful in 
setting up wage schedules. 


I. Construction and Characteristics of Job Scales 
Built by the Method of Paired Comparisons 


The Jobs Studied. The investigations reported here were carried out on the 
clerical jobs of two comparatively large, Midwestern, paper mills. In one mill 
(Plant A), 70 jobs supplied the subject matter of study; in the other (Plant B), 
33 were studied. The group of 103 jobs covered a wide range of clerical skills; 
within the population were included Messenger and Mail Boy, the clerk classi- 
fications of accounting, purchasing, sales, billing, and scheduling departments, 
the specialized jobs associated with the operation of electric punched card 
equipment, and the supervisory jobs immediately associated with the jobs 
mentioned above. 

The Job Analysis. In both of the plants, preparatory to the scaling project, 
all of the jobs were subjected to intensive study. The study methods were 
modeled after those developed and used by the United States Employment 
Service (11). In each plant, trained job analysts, working down the organiza- 
tional chart, interviewed and observed to the organizational level of the jobs 
under study. The data collected thus represented the joint opinions of the 
employee performing the job, the supervisors immediately responsible for the 
job, the departmental head under whose jurisdiction the job fell, and the job 
analyst whose responsibilities were those of collecting, collating, and organizing 
the data into a formalized job description. The job descriptions were repro- 
duced in final form only after the employees and supervisors who supplied the 
original data were given an opportunity to review and then to endorse them. 
In both plants, the completed job descriptions¹ were assembled in bound form,
and in this form they served as the raw materials on which the judgments 
called for by the scaling operation were made. 

The Collection of the Scale Data. In both plants the judgments called for 
by the Method of Paired Comparisons were made by those persons within the 
organizational structure who were presumed to know the jobs in question best, 
namely, the personnel working at them and the supervisors immediately re-
sponsible for them. In Plant A, thirteen judges (7 working on the jobs and 6
supervising them) and in Plant B, ten judges (5 and 5) constituted a “scaling
committee.” The members of these two committees were called together for an orientation
meeting by their respective industrial relations departments. At these meet- 
ings, the objectives of the project were outlined, the procedures to be used were 
reviewed, and the members of the committees were given the materials which 
they were to use in arriving at and reporting their judgments. These materials 


1 These job descriptions contained considerably more detail than one conventionally 
finds in the descriptions prepared for a job evaluation project. The objective in each 
case was to provide the reader, even though he had little previous contact with the job, 
with enough detail to permit a judgment of the skills and knowledges which it required, 
the responsibilities which it entailed, and the conditions under which it was typically 
performed—in short, to arrive at judgments concerning those characteristics which are 
conventionally associated with job worth. 
consisted of a bound volume of the job descriptions, which had been prepared
earlier, and a set of forms on which their judgments were to be recorded. It 
was then possible for the members of the committee to proceed independently, 
and at their leisure, to make their judgments. It might be pointed out here, 
that the routines of the Method of Paired Comparisons are particularly well 
adapted for use in the industrial situation. Since comparatively naive judges 
can be introduced to the task called for by the method with a minimum of 
training, it is possible to work with large numbers of judges who can proceed 
independently under a minimum of supervision. 

By following this general procedure, the jobs in the two plants were scaled 
independently on four traits or “elements” which a preliminary review of the
literature of job evaluation indicated as being potentially most useful in dis- 
criminating between clerical jobs. For purposes of the evaluation, these traits 
were defined in the following manner: 


(a) Educational Skills. The degree to which the job demands preparatory 
skills (verbal, quantitative, etc.) which are most generally acquired in the 
schoolroom. 

(b) Work Skills. The degree to which the job demands specialized skills 
which can only be acquired either through job training or by extended experi- 
ence on the job. 

(c) Application Skills. The degree to which the job makes special demands 
on the individual worker; the degree to which the job is unpleasant, tiresome, 
monotonous, dirty, etc. 

(d) Social and Personal Skills. The degree to which the job requires 
ee skills—skill in supervising and in coordinating the activities 
of others. 


Thus, both groups of judges were required to make their inter-job com-
parisons in four frames of reference. If the Method of Paired Comparisons
had been used in its traditional form, this would have meant that each judge
in Plant A would have had to make 4 × (70 × 69)/2 judgments (9,660) and
those in Plant B, 2,112. To reduce the number of pairs of jobs in Plant A to
a more feasible number, a suggestion which Uhrbrock and Richardson (9) 
made earlier was followed. By using key jobs, against which all comparisons 
were made, and groups of ten jobs in which only the in-group comparisons 
were made, the total number of judgments made by each judge was reduced 
from 9,660 to 3,660. These job groups were set up in the following manner. 
The investigating staff selected, from the group of 70 jobs, ten which in their
opinion seemed to fulfill the dual criterion of being generally well known and 
which collectively represented the entire range of abilities required by the 
seventy. These constituted the “key group.”² The sixty remaining jobs,
then, were assigned to groups of ten in a random fashion. In preparing the 
worksheets for the judges, a scrambled order of pairs was used; each job title 
was presented first in half of the pairs; and the pairs involving the key jobs 
were interlaced throughout the whole list. No judge was informed that the 
key job device was being employed. In Plant B, the judges’ worksheets called 
for the complete set of 2,112 judgments. Apparently, as we shall see later
when the results from the two plants are compared, the modified procedure 
employed in Plant A did not distort the final results. Making and recording 
the judgments required from six to ten hours of the judge’s time. 
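The judgment counts quoted above can be reproduced with a few lines of arithmetic. This sketch assumes one reading of the key-job procedure: every non-key job is paired with each of the ten key jobs and with the other jobs in its own group of ten, and the key jobs are paired among themselves. That reading is an interpretation, but it does yield the 3,660 figure reported in the text. The function names are illustrative.

```python
TRAITS = 4  # each pair of jobs is judged once on each of the four traits


def pairs(n):
    """Number of distinct pairs among n jobs."""
    return n * (n - 1) // 2


def full_design(n_jobs, traits=TRAITS):
    """Judgments per judge when every job is paired with every other job."""
    return pairs(n_jobs) * traits


def key_job_design(n_jobs, n_key, group_size, traits=TRAITS):
    """Judgments per judge under the assumed key-job scheme."""
    others = n_jobs - n_key
    n_groups = others // group_size  # assumes the jobs divide evenly into groups
    per_trait = (
        pairs(n_key)                      # key jobs against each other
        + others * n_key                  # every other job against every key job
        + n_groups * pairs(group_size)    # in-group comparisons only
    )
    return per_trait * traits


print(full_design(70))             # 9660 (Plant A, traditional form)
print(full_design(33))             # 2112 (Plant B)
print(key_job_design(70, 10, 10))  # 3660 (Plant A, key-job modification)
```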


2 The key jobs were: Dark Room Technician, Junior Stenographer, Mail Boy, Pay- 
roll Clerk, Record Clerk, Scheduling Clerk, Secretarial Assistant, Statistical Supervisor, 
Stenographer, and Telephone Operator. 
Computation of the Scale Values. The data from each group of judges were
summarized and, following a “shortcut” procedure recommended by Guilford 
(2), the scale value equivalents of each job were computed. This procedure 
was employed in preference to that called for by Thurstone’s Case V of the 
Law of Comparative Judgment for the following reasons: (a) The small number 
of judges used hardly warranted the laborious operation of computing the 
several estimates of each scale separation which is required by the Thurstone 
procedure and (b) Guilford (2) has demonstrated empirically the compara- 
bility of scale values derived from using his abbreviated procedure and those 
derived from Case V procedure. 


Results 


The results of the operations described above were four skill scales 
which were presumed to be capable of measuring the dimensions on which 
wages for clerical jobs are commonly paid. At this point, the problem of 
combining the measurements yielded by these scales arises. If a suitable 
criterion is available, multiple correlational procedures are probably most 
appropriate. In a certain sense, the validation of job scales presents an 
even more difficult problem than is typically encountered in the validation 
of employee selection instruments; here, the problem is not only one of 
measuring the criterion, but, in the first place, of defining one. Lacking 
more suitable standards, in the typical wage evaluation project, job 
measurements are evaluated in terms of how well they reproduce the 
existing wage structure in the plant or the wage structures of other 
similar plants in the area. Both procedures obviously have serious 
shortcomings. 

In the project described here, wage survey data for similar jobs 
outside the plant were assembled with the expectation that these data 
might be employed as a “criterion.” Preliminary tabulations made it
quite obvious that these data were incapable of generating correlation 
with anything, even themselves; the differences in wages paid for what 
were presumed to be similar jobs were oftentimes as large as, or even
larger than, those which existed between different jobs. Accordingly, in
both studies, the plants’ prevailing rates were used as criteria in com-
bining the scale values of the four skill scales.

A multiple regression equation was written for predicting rates from 
scale values. The multiple R’s resulting from the application of the 
regression equation were .77 and .83 in Plants A and B respectively. 
In both plants, the Work Skills Scale contributed the most toward ac- 
counting for the total variance of “going rates.” Apparently then, the 
kinds of measurements made by paired comparisons can yield measure- 
ments which are capable of ordering jobs with respect to their worth. 
The results reproduced in Table 1 also reveal that even better scales 
might be developed; the skill scales obviously do not measure independent 
dimensions. This would suggest: (a) that the original choice of the
traits was a poor one; (b) that the traits were poorly defined; and/or (c) 
that the judges were not highly proficient in making the kinds of dis- 
criminations which this project called for. 


Table 1

Intercorrelations of the Scale Values Derived for Each of Four Job Traits
and Their Correlations with Rates

                                      Plant A*
Trait                          2       3       4       5
1. Educational Skills         .93    -.49     .73     [?]
2. Work Skills                       -.39     .75     [?]
3. Application Skills                        -.34    -.14
4. Social and Per. Skills                             .66
5. Going Rates

[? = entry not legible; the Plant B* columns are not recoverable.]

* None of the inter-plant differences in the z-equivalents of the r’s attain statistical
significance.


Other characteristics of the scale values derived here may be pointed 
out. For one, the analyses suggest that these values are in general inde- 
pendent of the particular population of jobs chosen, i.e., that they have 
general validity. The correlations between the scale values of jobs in 
Plant A and those for jobs in Plant B, which the job analysis data indi- 
cated as similar in content, are presented in Table 2. These findings 
should be of special interest since they suggest that “standard scales” are 
feasible—that scales can be developed which will be of general applica- 
bility in job evaluation projects. 


Table 2

Correlations between the Scale Values Derived in Plant A with Values Derived
for Twenty-three Similar Jobs in Plant B

Job Trait                      r_AB
Educational Skills             .92
Work Skills                    .92
Application Skills             .34
Social and Personal Skills     .91





Further analyses of these data suggest high consistency in the judg- 
ments made by the several judges. Table 3 summarizes these findings. 
The coefficients reported in Column r_ii are average intercorrelations be-
tween judges (5) and may be regarded as estimates of the reliability of the
Table 3

Reliability of the Judgments on which the Scale Values were Based

                                  Plant A            Plant B
Job Trait                      r_ii    r_AA       r_ii    r_AA
Educational Skills             .805    [?]        [?]     .993
Work Skills                    .777    [?]        [?]     .990
Application Skills             .623    [?]        [?]     .979
Social and Personal Skills     .812    [?]        [?]     .989

[? = entry not legible.]





Table 4

Correlation between the Sum of Judgments Made by Employee and
Management Representatives

                               Plant A    Plant B
Job Trait                        r’s        r’s
Educational Skills               .96        .99
Work Skills                      .93        .98
Application Skills               .92        .94
Social and Personal Skills       .95        .91





individual judge’s judgments; those in the r_AA column are the estimates
resulting from applying the Spearman-Brown prophecy formula to the
r_ii values. Using comparatively large groups of evaluators obviously
results in highly reliable judgments. These coefficients compare quite
favorably with the few that are reported for “point-evaluation” judg-
ments in the literature (4, 6). Further, from the above it may be pre-
sumed that the individuals who constituted the scaling committee were
quite homogeneous in their outlooks toward the jobs which they evaluated 
—this, in spite of the fact that the committee membership was chosen 
to represent both employee and supervisory points of view. The cor- 
relations between the sums of employee and management judgments are 
reported in Table 4. This finding would suggest, then, that the attitudes 
of the judges who participate in a job scaling project (if we can assume 
that there were differences in the attitudes of the members of our groups) 
are not likely to color their judgments of the jobs. This finding is con- 
sistent with the findings of other investigators (1, 3) who have studied 
the scale values assigned to opinion statements by judges who differ 
pronouncedly in their attitude toward the object being investigated. 
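The step from the average inter-judge correlation to the reliability of the pooled judgments can be sketched with the standard Spearman-Brown formula, r_kk = k r / (1 + (k - 1) r), where k is the number of judges. The input values below are purely illustrative and are not taken from Table 3; the function name is an assumption made for this example.

```python
def spearman_brown(r_single, k):
    """Estimated reliability of the pooled judgments of k judges,
    given the average correlation r_single between pairs of judges."""
    return k * r_single / (1.0 + (k - 1) * r_single)


# Illustrative values only: an average inter-judge correlation of .50
# and a committee of ten judges.
print(round(spearman_brown(0.50, 10), 3))  # 0.909
```

The formula shows why even moderately consistent individual judges yield highly reliable pooled values once the committee is large.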


Summary: The Method of Paired Comparisons 


In two investigations jobs were scaled on four traits by using the 
Method of Paired Comparisons. The results of these investigations 
indicate that jobs can be scaled on these dimensions and that the measure-
ments yielded by such scales can effectively be used to order jobs in a 
fashion which is valid for rate setting. The findings further suggest 
that the method used results in scale values which are independent of 
the particular population of jobs chosen. 

At the practical level, the methods employed are particularly well 
adapted for industrial usage: (a) They permit the participation of large 
numbers of evaluators; (b) they can be employed with comparatively 
naive evaluators, i.e., little training time is demanded; (c) even untrained
evaluators report little difficulty in making the judgments called for; 
(d) the judgments can be made with a minimum of supervision and 
follow-up review; and (e) the resulting measurements are highly reliable. 


II. Construction and Use of a Scoring Key for a Job Specification Form 


In the same plants in which the investigations described above were 
made, trained job analysts collected and summarized other job data; 
they prepared specification forms which in form and content were some- 
what like the Worker Characteristics Form employed earlier by the United 
States Employment Service in its job studies (11). The items, of which 
there were eighteen, covered various aspects of the skills and knowledges 
required by the jobs analyzed.³ Each item was prefaced by a brief
statement defining a particular skill (or knowledge) and this was followed
by three or four alternative phrases or statements descriptive of various
degrees of skill. These alternatives were drawn up arbitrarily to define
approximately equal distances along the skill scale. The following is a
sample item: 


Responsibilities for planning and laying-out work. 


1. All work planned and laid out by the supervisor.
2. Particular class of tasks allocated to worker; lays out own schedule
   according to established routines.
3. Works on a job basis but has the responsibility for setting up own
   work operations and schedule.
4. Particular class of tasks allocated to worker; responsible for setting
   up own work operations and schedule.


Collection of the Data. As in the case of the job description preparation, 
described in Section I, the ratings called for by the specification forms were 
made on a cooperative basis by the job analyst, the immediate supervisor, and 
by the employee performing the job. One hundred and three such forms 
(70 in Plant A and 33 in Plant B) were prepared. These data supply the basis 
for the analysis reported in this section. 

Analysis of the Data. Collectively, the items of the job specification form
cover the same subject matter that was dealt with in the two scaling projects 
described above, so it seemed reasonable to presume that the ratings reported 


3 See Table 5.
on the specification sheets might be turned to the same usage as the paired
comparisons data—namely, to order jobs with respect to their worth. Accord- 
ingly, a scoring key was developed for these items. 

Each of the 18 ratings for the 70 jobs in Plant A was correlated with “going
rates” and an equation was written for combining the “scores” of the individual
items. In this equation, the individual item ratings were weighted in terms 
of their correlation with rates and the reciprocals of their respective standard 
deviations. 
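The weighting scheme just described can be sketched as follows. This is a minimal illustration, assuming that each item's weight is its correlation with rates divided by its standard deviation and that a job's score is the weighted sum of its item ratings; the exact form of the equation is not given in the text. All names and data below are hypothetical.

```python
import statistics


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_weights(items, rates):
    """items: one list of ratings per item, aligned with rates (one entry per job).
    Weight = correlation with rates times the reciprocal of the item's SD."""
    return [pearson(col, rates) / statistics.pstdev(col) for col in items]


def score(job_ratings, weights):
    """Weighted sum of one job's item ratings."""
    return sum(w * x for w, x in zip(weights, job_ratings))


# Hypothetical data: two items rated for five jobs, with made-up going rates.
items = [[1, 2, 2, 3, 4], [1, 1, 2, 2, 3]]
rates = [0.80, 0.95, 1.00, 1.20, 1.45]
weights = item_weights(items, rates)
scores = [score(job, weights) for job in zip(*items)]
print([round(s, 2) for s in scores])
```

Dividing by the standard deviation puts items rated on wider scales on the same footing as the others, so that the correlation with rates alone determines an item's influence on the total score.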


Table 5

Correlations between the Items of the Specification Form and Going Rates and
the Standard Deviations of the Individual Item Ratings

Item on Specification Sheet
[correlation and standard deviation columns not recoverable]

 1. Formal schooling demanded by the job
 2. Skill in the use of numbers and numerical operations
 3. Skill in the use of words: spelling and vocabulary
 4. Skill in reading
 5. Vocational training needed for the acquisition of job skill
 6. Training on the job
 7. Kind of supervision received on the job
 8. Responsibility for planning and laying out work
 9. Responsibility for making decisions
10. Conditions under which work is performed
11. General nature of work: interesting, stimulating or routine and dull
12. Physical demands of the job
13. Supervision given to other workers
14. Relationships with other workers on the job
15. Relationships with persons outside the department
16. Skill in oral expression
17. Ability to maintain confidences
18. Appearance and dress requirements

Results


As a check on the accuracy of the scoring key developed (i.e., the 
ability of the key to reproduce the criterion on which it was built), the 70 
specification forms were scored and the resulting scores correlated with 
rates. The coefficient was .89. Undoubtedly with further statistical 
manipulation of the item weights a larger proportion of the criterion 
variance could have been accounted for. The operation of correlating 
scores with the criterion on which the scoring key was originally built is, 
of course, no check of either the validity of the procedure or its general
usefulness. Accordingly, a similar set of specifications, which was de- 
veloped in Plant B by another group of job analysts, was scored with 
the key developed in Plant A; again the resulting scores were correlated 
with rates. In Plant B, specifications correlated .92⁴ with rates. Thus,
when an independent criterion and a new population of jobs are employed,
the scoring key is found to be quite satisfactory. 
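The two validity coefficients (.89 for the 70 jobs of Plant A and .92 for the 33 jobs of Plant B) can be compared with a short sketch. It assumes the usual test based on Fisher's z-transformation of r; the text mentions z-equivalents elsewhere but does not state which test was applied here. The function name is illustrative.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """Normal deviate for the difference between two independent correlations,
    using Fisher's z-transformation (z = atanh r, variance 1 / (n - 3))."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (z2 - z1) / se


z = fisher_z_difference(0.89, 70, 0.92, 33)
print(round(z, 2))  # 0.76
```

A deviate of about 0.76 is well below the conventional 1.96, so under these assumptions the higher coefficient in Plant B cannot be distinguished from sampling fluctuation.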


Summary: A Scoring Key for a Job Specification Form 


The procedures used in the development of a scoring key for job
specification forms have been described. Such a scoring key was found
to yield scores which are related to wage payments made to clerical 
workers. There is some evidence to support the conclusion that such a 
scoring key developed in one plant may be of general usefulness in 
evaluating similar jobs in other plants. 


Discussion 


The two approaches to job measurement described here may be com- 
pared and contrasted. As indicated above, they yield results which are 
very similar so that one’s choice between them would probably be 
governed by considerations other than one of accuracy or validity of 
measurement. First, it might be pointed out, the scoring key resulting 
from the application of Method Two, can be developed in a much shorter 
period of time primarily because the volume of data dealt with is much 
smaller; in contrast, the Method of Paired Comparisons, even when 
“short cuts” are employed, is always cumbersome. Further, with
Method Two, once adequate job analysis data have been collected, it is 
a comparatively simple task to collect the judgments called for by the job 
specification; but, it must be borne in mind that judgments of this sort 
can only be made by persons who have very intimate contacts with the 
jobs for which they are preparing specifications. Training of the evalu- 
ators might, of course, overcome this limitation. 

It might be argued then, that the Method of Paired Comparisons is 
more suitable for those kinds of projects where: (a) it is desirable to make 
the scaling project a cooperative one with comparatively large judging 
groups representing all interests, and (b) where one has a minimum 
amount of time to devote to the training of the judging group. Apart 
from the fact that paired comparisons data are generally highly reliable, 
and that the method has a well-established theoretical basis, the above 
characteristics, in many industrial plants, would strongly recommend 
this method. 

4 Note that this coefficient is slightly higher than the one obtained on the initial
check validation. The difference in these two values does not attain statistical sig-
nificance.

On the other hand, it is the writer’s opinion that the scoring-key
method may be particularly valuable in certain special circumstances.
Once such devices have been developed, they may be of particular use- 
fulness in those situations where a comparatively small number of new 
jobs needs to be slotted into an already established wage structure, or,
again, where the manufacturing unit is so small as to make other more
elaborate procedures impractical. The scoring-key method can easily
be used as a supplement to any of the commonly used job evaluation 
schemes. 


Received October 14, 1948. 


The Effect of Equating Interest Test Items for Prestige Value


A number of the currently used vocational interest scales require the
subject to choose between pairs or groups of occupational titles or activi-
ties.¹ The assumption is made that degree of interest in a given field may
be measured by the frequency with which one selects items that fall in
this field over the other items with which they are grouped.

Interest scores obtained in this way should be most reliable when 
the items are matched for all factors that might influence choice except 
interests or personal values. Where choices are required between oc- 
cupational titles, for example, preferences might at times be determined 
by such factors as the prestige of the occupations or their monetary return 
rather than by interest in a general type of work. Thus, if an item re- 
quires a choice between the occupations of United States Senator and 
scientific laboratory assistant, a person of high scientific interests might 
choose Senator because of its far greater prestige value. At the present 
time, however, there is no experimental evidence of the effect that factors 
such as these exert on interest scores.² It is the purpose of the present
study to determine the extent to which the factor of prestige can in- 
fluence such scores. The study originated in an attempted revision of 
the Allport-Vernon Study of Values (1). One type of item in the proposed 
revision consisted of pairs of occupational titles, the occupations being 
chosen to represent Spranger’s value categories. From these items a 
person’s score for a value category was to be determined by the frequency 
with which the occupations representing the category were preferred over 
those with which they were paired. It was in connection with the con-
struction of this part of the test that the question arose concerning the
necessity of matching the occupations for prestige value. The experi-
ment to be described is an attempt to answer this question.³


1 See, for example, the Kuder Preference Record (4), the Thurstone Interest Schedule
(11) and the Occupational Interest Inventory (Lee and Thorpe) (5).

2 The advisability of holding such factors constant, however, has been recognized.
For example, in the construction of the Occupational Interest Inventory (5), the activities
in each item have been roughly matched for job level.


The plan of the study involved the following steps:


1. The selection of occupational titles that fall into the Spranger categories. 

2. The scaling of these occupations for prestige value by Thurstone’s method 
of equal-appearing intervals. 

3. The construction of an interest inventory in which the items consisted 
of pairs of occupational titles. From this scale, three separate scores could be 
computed for each Spranger value category. For example, one of the aesthetic 
interest scores was to be based on items in which aesthetic occupations were 
paired with other occupations equal to the aesthetic in prestige value. The 
second aesthetic score was to be computed from items in which the aesthetic 
occupations were higher in prestige value than the occupations with which they 
were paired. The third aesthetic score was to be computed from items in 
which the aesthetic occupations were lower in prestige value than the occupa- 
tions with which they were paired. If prestige influences occupational prefer- 
ences, then these three scores should differ significantly. 


The only factors systematically varied in this study were the prestige 
values of the occupational titles and the interest categories in which they 
belonged. Other factors that might affect choice, such as financial 
returns or social service value, were neither isolated nor controlled. It 
is assumed that the influence of such factors would be similar to the factor 
of prestige, as Anderson (2) has shown high correlations between these 
factors. 

Method of Selecting the Occupational Titles 


The primary concern in the selection of the occupational titles was 
that they could be unambiguously classified into the Spranger value
categories. Five interest categories were used: theoretical, economic,
aesthetic, political, and social-religious. We decided arbitrarily to con-
solidate the social and religious values because (1) it was impossible to
find a sufficient number of distinct occupations that fitted in the religious
category and (2) both social and religious occupations seemed to involve
humanistic and social-service interests and activities.⁴


3 It is well known that in certain situations prestige is an important determiner of
choice. The pioneer study of Moore (8) demonstrated that students’ preferences for 
grammatical expressions, ethical situations and musical dissonances were influenced by 
knowledge of expert opinion. Studies by Marple (7), Sherif (9) and others have con- 
firmed Moore’s findings. In these studies, the opinion of experts was made explicit in 
the experimental procedure by assigning one of the choices or statements to the authority 
in question. In the present study, the factor of prestige operates in an entirely different 
way as it is inherent in the item. 

4 Correlational and factorial studies of scales measuring the Spranger values have
yielded marked differences in the correlations between social and religious value scores. 
VanDusen, Wemberly and Mosier (12) report a correlation of .61; Ferguson, Humphreys 
and Strong (3) a correlation of .22. Lurie (6) reports a factor with high loadings on 
both scales. Even though social and religious values may be somewhat distinct, occu- 
pations that fall in the religious category seem to entail social service activities as well 
as high interest in spiritual values. 

In searching for titles that could be classified in this manner, it soon became
apparent that the list would have to be limited to professional, sub-professional 
and business occupations. These are the only types of occupations that seem 
to represent the Spranger values in an unambiguous manner. Skilled, semi- 
skilled, clerical and many other types of occupation had to be excluded. For 
example, the profession of artist seems clearly to fall into the aesthetic category 
whereas the occupation of house painter clearly fits neither the artistic, eco- 
nomic, nor any of the other categories. Consequently, the prestige range 
covered by the occupations chosen represents only a small fraction of the entire 
occupational prestige range. The range, however, is representative of that 
covered in certain existing interest scales. 


One hundred occupational titles were selected, 20 to represent each 
of the five value categories. 


Method of Construction of the Psychophysical 
Occupational Prestige Scale 


Thurstone’s method of equal-appearing intervals was followed in 
scaling the 100 occupational titles in respect to prestige value. 


Fifty students in an advanced undergraduate class in experimental psy- 
chology served as judges. The 100 titles were printed on separate cards, 
arranged in random order, numbered, and a set was presented to each judge 
with the instructions to sort them into seven piles with apparently equal 
intervals between them. A seven-step scale was used instead of the traditional 
eleven-step scale since it was believed that with occupations as homogeneous 
in respect to social prestige as the ones chosen it would be impossible to dis- 
criminate eleven steps. With this exception, Thurstone’s procedure was 
followed in determining the median and Q values for each occupational title. 
These values are presented in the second and third columns of Table 1. 
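
The median and Q computation described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' worksheet: the function names are hypothetical, each pile k is assumed to cover the scale interval from k - 1 to k (an assumption suggested by scale values below 1.0 in Table 1), judgments are assumed to be spread evenly within a pile, and Q is taken as the full interquartile range. The pile counts are invented for the example.

```python
from collections import Counter

def grouped_percentile(piles, p, n_steps=7):
    """Point on the scale below which a proportion p of the judgments fall.

    Assumption: pile k covers the interval (k - 1, k), and judgments are
    spread evenly within a pile (linear interpolation).
    """
    counts = Counter(piles)
    target = p * len(piles)
    cumulative = 0
    for k in range(1, n_steps + 1):
        c = counts.get(k, 0)
        if c > 0 and cumulative + c >= target:
            return (k - 1) + (target - cumulative) / c
        cumulative += c
    return float(n_steps)

def scale_value_and_q(piles, n_steps=7):
    """Return (median scale value, Q) for one occupational title."""
    median = grouped_percentile(piles, 0.50, n_steps)
    q = (grouped_percentile(piles, 0.75, n_steps)
         - grouped_percentile(piles, 0.25, n_steps))
    return median, q

# Hypothetical sortings of one title by 50 judges (not data from the study).
example_piles = [1] * 10 + [2] * 30 + [3] * 10
median, q = scale_value_and_q(example_piles)
```

A small Q means the judges agreed closely on where a title belongs; a large Q marks a title whose prestige was judged ambiguously.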


Table 1


High, Median, Low and Mean Scale Values of the Occupational Titles
in each Interest Category *


                          Scale Values
Interest Category      High    Median    Low     Mean

Political              0.60    1.91      6.35    2.45
Economic               1.75    4.45      6.45    4.43
Theoretical            1.72    3.34      5.22    3.37
Aesthetic              1.78    3.52      6.24    3.77
Social-Religious       2.25    3.86      5.70    4.01


* To reduce printing costs, Table 1 is presented here in greatly abbreviated form. 
The complete table, showing median scale values, Qs, and per cent unambiguous agree- 
ment in classification into value categories for each of the 100 occupational titles, has 
been deposited with the American Documentation Institute. For the six pages in- 
volved, order Document 2624 from the American Documentation Institute, 1719 N 
Street, N.W., Washington 6, D. C., remitting $0.50 for microfilm (images 1 inch high 
on standard 35 mm. motion picture film) or $0.50 for photocopies (6 x 8 inches) read- 
able without optical aid. 

Inspection of Table 1 shows rather marked differences in the prestige
values of the occupations representing the various Spranger values. On 
the whole, the political occupations ranked very high in prestige. The 
nine highest ranking occupations belonged in this category. The means 
of the median scale values for the occupations in each category are as 
follows: political, 2.45; theoretical, 3.37; aesthetic, 3.77; social-religious, 
4.01; economic, 4.43. These values of course refer only to the occupa- 
tions selected in this study. 

Since the political occupations ranked so very high in respect to 
prestige, this category had, later, to be excluded. There were simply 
not enough low-ranking political occupations to pair with others in con- 
structing the interest scale. 

The Q values (double the usual semi-interquartile range) of the 
occupations ranged from 0.5 for prime minister to 2.7 for poet. The
mean ambiguity value was 1.72. These Q values are large compared 
with those found in the construction of an attitude scale. The size of 
Q is undoubtedly a function of the homogeneity of the items, but at the 
same time it also represents the fact that in our culture, professional and 
business occupations do not fall into a strict hierarchy in respect to
prestige. 

In the construction of the interest scale, occupations were not dis- 
carded on the basis of high Q values. The justification for this procedure 
was that the scores for each interest were to be based on a fairly large 
number of items. Consequently, individual differences in susceptibility 
to the prestige of the individual items should cancel. 


Method of Checking the Accuracy of Classification 
of the Occupational Titles 


In order to check on the accuracy of the classification of the 100 occupa- 
tional titles into the Spranger value categories, 50 advanced undergraduate 
students were asked to sort them into the five categories. They were given a 
list of the 100 titles arranged in random order, descriptions of the five interest 
or value types adapted from Vernon and Allport (13) and they were asked to 
classify each title in one of the interest categories whenever this was possible. 
Such classifications will be referred to as unambiguous classifications. If an 
item seemed to fit into several of the categories, it could be placed in each. 
In such instances, the student was asked to indicate whether it seemed to fit 
equally well into both categories, or whether its placement into one seemed 
somewhat more suitable than the other. Such classifications will be referred 
to as coordinate classifications. These results are presented in Table 1 in the 
fourth column which shows the per cent of subjects placing the occupations in 
the designated value category. 


The results indicated that there was fairly close agreement among
the subjects concerning the proper classification of the majority of the 
occupational titles. 
Eighty of the occupations were placed in the same interest category
by at least 80% of the judges. Of these 80 occupations, 67 were un- 
ambiguously placed in the same category by at least 80% of the raters. 
That is, at least 80% of these raters indicated no coordinate category for 
these 67 occupations. Thirteen additional occupations were placed in 
the same interest category by at least 80% of the judges, but a small 
proportion of the raters indicated a second coordinate category into which 
the occupation might also fit. These 13 occupations are designated by 
an asterisk. The remaining 20 occupational titles yielded less than 80% 
agreement and were therefore eliminated in the construction of the in- 
terest scale. These 20 occupations are designated by a double asterisk. 

The eighty occupations that met the criterion of 80% agreement in 
classification were distributed as follows among the five value categories: 
16 theoretical, 19 economic, 17 aesthetic, 14 political and 14 social- 
religious. 

Method of Constructing the Interest Scale 


The two preliminary steps provided facts concerning the prestige 
values of the occupational titles and the degree to which each fitted the 
modified Spranger categories. The next and major task was to con- 
struct an interest inventory from which it would be possible to determine 
whether the prestige value of an occupation is a factor that will influence 
interest scores. 


After eliminating the occupational titles in the political category and those 
titles that did not meet the criterion of 80% agreement in classification by the 
judges, 66 titles were available for constructing the scale. 

The completed inventory was composed of 120 items, each item consisting 
of two occupational titles. Sixty of the items contained titles of equal prestige 
value, equal prestige value being defined as a difference of less than 0.50 points 
on the seven-point prestige scale. The mean difference in prestige value in 
these items was 0.19 points. The remaining 60 items contained titles that 
differed from each other by .60 points or more in prestige value. The mean 
discrepancy for these items was 1.53 points. 

The inventory was constructed in such a way that the following four scores 
could be computed for each interest category: 


1. An Equal Score. This score was derived from the 60 items in which the 
prestige values of the occupations making up the items differed by less than 
0.50 points on the 7-point prestige scale. In this part of the scale, each interest 
category was compared with each other category ten times. That is, for 
example, 10 items involved comparisons of T and E titles; 10 involved com-
parisons of T and A titles; 10 involved comparisons of T and SR titles, etc.
The maximum possible equal score for an interest category was 30. 

2. A Favored Score. This score was derived from 30 of the 60 items in 
which the prestige values of the occupations differed by more than .60 points 
on the prestige scale. Here each interest category was compared with every 
other 5 times. The maximum possible favored score was therefore 15. 

3. The Opposed Scores. These scores were derived from the remaining 
30 items in the same manner as the favored scores except that here prestige 
operated against the selection of the occupations in an interest category. 

4. The Unequal Scores. The unequal score was the sum of the favored and
opposed scores, and represents the score for an interest category derived from 
the 60 items in which the prestige values of the titles differed. The maximum 
possible unequal score for an interest is 30. If prestige influences occupational 
preferences, the unequal interest scores should be more alike than the com- 
parable interest scores derived from items in which the prestige factor is held 
constant. 
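
The four scores defined above can be tallied mechanically. The sketch below is only illustrative: the item encoding, function names and the three demonstration items are hypothetical, and the percent conversion simply divides each raw score by the maximum stated in the text (30 for equal and unequal, 15 for favored and opposed).

```python
CATEGORIES = ("T", "E", "A", "SR")
MAX_RAW = {"equal": 30, "favored": 15, "opposed": 15, "unequal": 30}

def score_inventory(items, picks):
    """Tally raw scores per interest category.

    items: list of (category_of_first_title, category_of_second_title, higher)
           where higher is 0 or 1 for the title higher in prestige, or None
           when the two titles are of equal prestige.
    picks: list of 0/1, the index of the title the respondent preferred.
    """
    raw = {c: {"equal": 0, "favored": 0, "opposed": 0} for c in CATEGORIES}
    for (first, second, higher), pick in zip(items, picks):
        chosen = (first, second)[pick]
        if higher is None:
            raw[chosen]["equal"] += 1
        elif pick == higher:
            raw[chosen]["favored"] += 1   # prestige worked for the chosen title
        else:
            raw[chosen]["opposed"] += 1   # chosen despite lower prestige
    for c in CATEGORIES:
        raw[c]["unequal"] = raw[c]["favored"] + raw[c]["opposed"]
    return raw

def percent_scores(raw):
    """Express each raw score as a percent of its maximum possible value."""
    return {c: {k: 100.0 * v / MAX_RAW[k] for k, v in scores.items()}
            for c, scores in raw.items()}

# Tiny hypothetical example (three items only, not the 120-item inventory).
demo_items = [("T", "E", None), ("A", "SR", 0), ("T", "A", 1)]
demo_picks = [0, 0, 0]
demo_raw = score_inventory(demo_items, demo_picks)
demo_pct = percent_scores(demo_raw)
```

In this encoding a choice counts toward the favored score of a category when the chosen title was the higher in prestige, and toward its opposed score when it was chosen despite being lower.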


Administration of the Interest Scale 


The 120 items were arranged in random order, mimeographed, and the 
inventory was administered to 275 students in first-year psychology 
classes. Of the completed inventories, 180 were chosen for analysis, 90 
from men and 90 from women students. For each student the 16 scores 
that have been described were determined, namely: 1. T, E, A and SR 
equal scores; 2. T, E, A and SR favored scores; 3. T, E, A and SR op-
posed scores; and 4. T, E, A and SR unequal scores.


Analysis of Interest Scale Results 


Three types of analysis were undertaken to determine the effect of 
prestige value on occupational choices. The results of the three analyses 
are entirely consistent and all three show that prestige has no effect what- 
ever on the interest scores. 


1. The first type of analysis involved computing correlations between
the equal, favored, opposed, and unequal scores for each interest category
in order to determine whether scores based on the various types of items
are in agreement. These correlations are shown in Table 2. They have
been computed separately for men and women.


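Table 2


Correlations Between the Equal, Favored, Opposed and Unequal Scores for
Each Interest Category


                        Theoretical     Economic        Aesthetic       Social-Rel
Scores                  Men    Women    Men    Women    Men    Women    Men    Women

Equal and Unequal       .93    .89      .95    .90      .87    .87      .87    .84
Equal and Favored       .82    .86      ...    ...      ...    ...      ...    ...
Equal and Opposed       .85    .84      ...    ...      ...    ...      ...    ...
Favored and Opposed     .62    .77      ...    ...      ...    ...      ...    ...

[Entries shown as ... could not be recovered from the source. Values that
survive but cannot be assigned to a column: Equal and Favored, .82, .82;
Equal and Opposed, .89, .66, .79; Favored and Opposed, .80, .50, .67.]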





Scores on the equal and unequal scales are highly consistent. For 
men, the equal and unequal T scores correlate .93, the E scores, .95; the 
A scores .87 and SR scores .87. The corresponding correlations for 
women are .89, .90, .87 and .84. Scores on these scales are based on 30 
items. 
It is apparent that these interest scores are highly consistent whether
based on items in which prestige cannot contaminate the scores or on
items in which prestige might exert some influence on vocational choice.
These high correlations also indicate high reliability for the scales. 

The correlations between the favored and opposed scores are lower, 
ranging from .47 to .80. It must be remembered that these scores are 
based on only 15 items. 

The method of correlation will show only whether scores on two 
scales are similar in rank order. It does not indicate whether one set of 
scores is numerically higher than the other. In order to determine 
whether the various scores for an interest category were directly com- 
parable, a second type of analysis was undertaken. 

2. The second method of analysis consisted of directly comparing the 
interest scores based on the equal, favored, opposed and unequal scales. 
To facilitate comparison, the raw scores were first converted into per-
cents.⁵ These percent scores then represent the proportion of times that
titles in the interest category under investigation were preferred to the 
titles with which they were compared. These scores are presented in 
Table 3. The scores for men and women are presented separately as 
the two sexes differed in their preferences. 


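Table 3

Equal, Favored, Opposed and Unequal Percent Scores for Each Interest Category


                Theoretical     Economic        Aesthetic       Social-Rel
Scores          Men    Women    Men    Women    Men    Women    Men    Women

Equal           43     ...      60     31       49     66       48     67
Favored         44     ...      65     33       46     58       48     55
Opposed         39     ...      60     37       49     70       49     ...
Unequal         ...    ...      63     35       47     64       49     ...

[Entries shown as ... could not be recovered from the source. The Theoretical
(Men) and Social-Rel (Women) entries are those quoted in the text.]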





It is evident that the factor of prestige has no significant effect on 
these percent scores. For the 90 men, for example, we find that the T 
titles are chosen 43% of the time from the equal scale, 44% of the time 
from the favored scale, and 39% of the time from the opposed scale. 
None of the differences are significant. 

Again, the SR titles are chosen 48% of the time from the equal scale, 
48% of the time from the favored scale and 49% of the time from the 
opposed scale. Again the differences, this time in the opposite direction, 
are not significant. The results for women are comparable. 


5 It should be remembered that the maximum raw equal and unequal scores were 30 
whereas the maximum raw favored and opposed scores were 15. 

The large majority of the differences are not statistically significant.
Of the 32 critical ratios presented in Table 4, only three are significant 
and here the differences are not in the expected direction. For example, 
there is a significant difference (C. R. equals 4.18) between the equal 
and favored SR scores for women. Reference to Table 3, however, 
shows that the equal percent score is 67 and therefore higher than the 
favored score of 55. 


Table 4


Critical Ratios Between Equal, Favored, Opposed and Unequal Scores
for Each Interest Category


                        Theoretical     Economic        Aesthetic       Social-Rel
Scores                  Men    Women    Men    Women    Men    Women    Men    Women

Equal and Unequal       0.56   0.48     0.69   1.23     0.50   0.76     0.12   1.20
Equal and Favored       0.25   0.03     1.29   0.66     0.83   2.43     0.18   4.18
Equal and Opposed       1.19   0.87     0.07   1.75     0.02   1.07     0.37   2.10
Favored and Opposed     1.39   0.83     1.22   1.05     0.84   3.48     0.53   6.32





The differences in the four obtained scores for each interest category 
cannot be attributed to the factor of prestige as there is no general 
tendency for the favored scores to be higher than the equal scores nor 
are these generally higher than the opposed scores. Differences in the 
obtained scores are presumably due to (1) chance factors and (2) the 
particular titles that occur in the equal, favored, and opposed scale items.
Although no analysis has been made of this last factor, it is obvious that 
certain occupations are generally more popular than others. This factor 
was not controlled in the construction of the scale. 

3. The third type of analysis consisted in computing for each indi- 
vidual a susceptibility to prestige score. This score was obtained from 
the 60 items in which the prestige values of the occupations composing 
an item differed. The score consists simply of the number of favored 
occupations selected minus the number of opposed occupations. The 
maximum possible range of this susceptibility to prestige score is from 
minus 60 to plus 60, a positive score representing susceptibility to prestige. 
The median susceptibility to prestige score for men was zero; for women, 
minus 6. It is obvious that there is no tendency to favor occupations 
high in prestige.
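
The susceptibility to prestige score is simple enough to state as a short function. The sketch below is an illustration only: the function name and the three respondents are hypothetical, and each of the 60 unequal items is assumed to be recorded as a single true/false value indicating whether the higher-prestige title was preferred.

```python
def susceptibility_to_prestige(picked_higher):
    """picked_higher: one boolean per unequal-prestige item, True when the
    respondent preferred the higher-prestige (favored) title."""
    favored = sum(1 for flag in picked_higher if flag)
    opposed = len(picked_higher) - favored
    return favored - opposed

# Hypothetical respondents over the 60 unequal items.
always_prestige = susceptibility_to_prestige([True] * 60)
indifferent = susceptibility_to_prestige([True] * 30 + [False] * 30)
never_prestige = susceptibility_to_prestige([False] * 60)
```

A respondent who always takes the more prestigious title scores plus 60, one who always takes the less prestigious title scores minus 60, and one who splits evenly scores zero.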


Summary 


The results of these three analyses seem to show clearly that, insofar 
as college students are concerned, preferences for occupations within the 
range studied here are not determined by the prestige which is accorded 
to that occupation. Although differences in prestige are recognized, as
is demonstrated by the fact that occupations can be scaled for this vari-
able, occupational preferences are not determined by this factor. In-
stead, preferences are apparently determined by far more basic interest
patterns, and, as the consistencies of the scales suggest, these interest 
patterns constitute a relatively stable component of the personality. 
This, of course, has often been pointed out by Strong (10). 

It may, in addition, be safe to conclude that in the construction of 
professional interest scales when occupational titles make up the items, 
the prestige values of the items may be ignored. 


Received October 8, 1948. 


Personal Preference Differences among Occupational Groups


This study was conducted in order to explore the occupational pat-
terns resulting from the use of a new measure of preference developed
by one of the authors. The mean scores of twenty occupational groups, 
ranging from unskilled labor groups to highly professionalized vocations, 
were compared with the average scores of a group of unselected men. 
There has also been investigation of the differences in mean scores among 
three occupational levels. The three occupational levels are approxi- 
mately those included in the major occupational groups of the Dictionary 
of Occupational Titles (1), as 0—Professional and Managerial occupations, 
1—Clerical and Sales occupations, 4 through 7, Skilled and Semi-skilled 
occupations. Results reported here are to be considered as suggestive, 
rather than conclusive, since an earlier abbreviated, and less reliable, 
form of the test was used in making the study. 

The new measure of preferences, entitled Preference Record—Personal, 
consists of five scales. Each of the scales has been developed so as to 
have high correlations among the items comprising a scale, and low cor- 
relations among the scales. The content of the scales may be described 
as follows: 


A. Preference for taking the lead and being in the center of activities 
involving people. 

B. Preference for dealing with practical problems and everyday 
affairs rather than interest in imaginary or glamorous activities. 

C. Preference for thinking, philosophizing, and speculating. 

D. Preference for pleasant and smooth personal relations which are 
free from conflict. 

E. Preference for activities involving the use of authority and power. 


It may be noted that the scales are based on recorded preferences. There 
is no implication intended that the scales measure actual facility in the 
areas described. 

The sample of unselected men is composed of the first 450 respondents
to the random sampling as described in the Manual (2). The only cri- 
terion for inclusion in this group was that the test blank be filled out in 
accordance with the instructions. For the sake of convenience we shall 
hereafter refer to this population of unselected adult males as the “base 
group.” 

Our occupational samples have been drawn from an additional group 
of more than 1000 returned Preference blanks obtained through the 
original sampling referred to above. Information as to the vocation of 
the subject was obtained from the personal data section of the test blank 
wherein he was requested not only to name but also describe his work. 
This double requirement of the subject that he both name and describe 
his employment made it possible to check doubtful titles and descriptions 
against those of the Dictionary of Occupational Titles, and thus increased 
the validity of our occupational groupings. For example, when job 
descriptions were checked we frequently found the title “Accountant” 
self-conferred on a “Bookkeeper.” 

The count and classification of the occupations represented in our 
sample of more than 1000 revealed more than fifty different occupations 
reported by our subjects. It was decided that those occupations rep- 
resented by twenty or more cases could be analyzed for the purposes of 
exploration. Accordingly, the test blanks of the members of the twenty
such occupations were isolated for study. Table 1 lists the occupations
grouped according to the appropriate level. There were 577 cases rep- 
resented in the twenty occupations which could be grouped as follows: 
Professional and Managerial, 10 occupational groups with a total of 
298 cases; Clerical and Sales, 5 occupational groups and 130 cases; and 
Skilled and Semi-Skilled, 5 occupations and 149 cases. Table 1 gives 
the number of cases for each occupational group. 


Procedure 


Mean scores and standard deviations for the base group of unselected
adult males were computed for each of the five scales of the Preference
Record—Personal. For each occupational group, only means were com-
puted since, for purposes of testing the significance of mean differences,
the variance of the unselected group was considered a much better esti-
mate of the population variance than the variances obtained from the
comparatively small occupational groups. Comparisons were then made
between the means for a particular occupation and those of the base group. 
By such comparison we could observe the difference between the average 
member of an occupation and the average member of a general group 
with reference to the scale in question. In order to observe the effects 
of occupational level on test scores, we have also computed the mean
scores for each of the occupational levels on the five scales, and studied
the differences found between the three major groups. The purpose of
this procedure was to determine the relation between enjoyment of a
certain type of activity and the position in the occupational hierarchy.
For example, are the average scores of professional people higher on the
scale referring to preference for use of authority than those found for
workers in the trades? Or, do clerical and sales employees indicate less
favorable attitudes toward taking the lead than do professional and
managerial workers?


Table 1


Mean Scale Scores for Base Group, Occupations and Occupational Levels for
Preference Record—Personal


Group                                 N   A          B          C          D          E

Base Group (x̄)                      450   10.06      13.35      11.67      18.68      14.16
Base Group (σ)                            4.1        3.2        3.5        5.2        3.9

Professional and Managerial         298   10.1       13.4       12.1       19.8       15.0
  Accountants                        35   10.0       13.0       12.1       19.1       15.1
  Bus. Managers                      25   9.4        13.2       13.0(+)    17.8       16.4(+)
  Chemical Engrs.                    21   9.9        14.7(+)    11.6       20.0       15.3
  Mechanical Engrs.                  29   9.7        13.9       11.5       21.1(+)    14.7
  Office Managers                    21   8.3(-)     13.2       13.0(+)    21.2(+)    15.0
  Personnel and Counseling Workers   23   11.6(+)    14.1       14.3(+)    20.1       16.4(+)
  Plant Managers                     24   9.5        13.0       12.1       18.9       14.3
  Retail Managers                    56   10.3       13.4       11.5       19.3       13.7
  Sales Managers                     26   11.5(+)    11.8(-)    12.6       18.8       16.5(+)
  Teachers                           38   10.0       13.9       11.1       21.4(+)    14.8

Clerical and Sales                  130   10.3       12.9       12.0       19.3       14.4
  Acct. Clerks and Tellers           35   8.7(-)     13.4       12.6       19.0       14.1
  Gen. Off. Clerks                   22   9.6        13.1       11.1       20.4       13.6
  Insurance Slsmn.                   23   11.8(+)    13.3       11.8       21.7(+)    14.4
  Salesmen other than to Consumer    27   12.8(+)    11.4(-)    11.7       18.9       14.9
  Salesmen to Consumer               23   8.7(-)     13.3       12.5       17.0       15.4

Skilled and Semi-skilled Trades     149   9.9        13.6       11.3       18.4       13.4
  Carpenters                         24   7.9(-)     14.3       10.2(-)    17.4       13.2
  Electricians                       33   10.3       13.6       11.2       18.5       14.0
  Factory Workers                    50   10.4       13.4       11.4       18.8       12.3(-)
  Foremen, Mfg.                      20   11.6(+)    13.2       11.4       18.6       14.9
  Telephone Linemen                  22   8.5(-)     13.4       12.4       18.1       13.7

+ means significantly greater than the base group at the 5% level.
- means significantly lower than the base group at the 5% level.

The significance of differences obtained was estimated in terms of the 
standard error of the difference between means. However, the standard 
deviation used in computing the standard error was that of the base 
group, rather than that of the occupational group in question, since this 
appeared to be a more accurate determination of the standard deviation 
of the universe than would any single sub-sample. The N in each such 
comparison was that of the occupational subsample.¹
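
The critical ratio described in this paragraph can be sketched as follows. This is only an illustration, assuming the standard error of the difference takes the usual two-sample form with the base-group standard deviation substituted for both groups. The base-group mean, standard deviation, and size below are hypothetical placeholders, not figures from the study; only the Carpenters' Scale A mean (7.9) and N (24) are taken from Table 1.

```python
import math


def critical_ratio(group_mean, base_mean, base_sd, n_group, n_base):
    """Difference between means divided by its standard error.

    Assumption: the base-group SD stands in for the SD of both groups,
    as the text describes. The two-term form of the standard error is
    our reading, not a confirmed transcription of the authors' formula.
    """
    se_diff = base_sd * math.sqrt(1.0 / n_base + 1.0 / n_group)
    return (group_mean - base_mean) / se_diff


# Hypothetical base-group values (mean, SD, N) chosen only for illustration.
# The Carpenters' Scale A mean and N come from Table 1.
cr = critical_ratio(group_mean=7.9, base_mean=10.0, base_sd=3.0,
                    n_group=24, n_base=450)
print(round(cr, 2))
```

Because the subsample N is small relative to the base group, the subsample term dominates the standard error, which is why the small occupational samples limit the conclusions that can be drawn.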


Results 


The means and standard deviations for the base group sample are 
shown in Table 1. Also in Table 1 are shown the mean scores for the 
three occupational levels, (1) Professional & Managerial, (2) Clerical & 
Sales, (3) Skilled and Semi-Skilled Trades. These three levels were 
compared with each other rather than with the base group of unselected 
men. Inspection of Table 1 indicates that substantial differences between
the base group and one or another of the three occupational levels did
occur on several scales. However, the significance of these differences
has not been computed. In Table 2 we have listed differences between
the mean scale scores among the three occupational levels, and for those
differences found to be significant at the 5% level, the critical ratio is
shown in parentheses.

Discussion of Results 


Any interpretation of results found in the study must be prefaced by a 
reminder of the small size of the occupational samples. Analysis of the 
data scale by scale yields information about both the scale and the atti- 
tude of the various occupations toward the activity embraced by the 
scale. 

Scale A. The items on this scale relate to a preference for taking the 
lead and being in the center of activities involving people. Inspection 
of Table 1 shows that more significant differences between occupational 
groups and the general population sample occurred on this scale than any 
other. The mean score for ten occupational groups differed significantly 
from that of our base group sample. Five occupations indicate highly 
favorable attitudes toward the activities involving social leadership: (1) 
Personnel & Counseling Workers; (2) Sales Managers; (3) Insurance 
Salesmen; (4) Salesmen other than to Consumer; and (5) Foremen. The 


¹ The formula used was: $SE_{\text{diff}} = \sqrt{\dfrac{\sigma_b^2}{450} + \dfrac{\sigma_b^2}{N}}$, where $\sigma_b$ is the standard deviation
of the base group.
five occupations showing less-than-average enjoyment in these activities
were: (1) Office Managers; (2) Accounting Clerks & Tellers; (3) Salesmen 
to the Consumer; (4) Carpenters; and (5) Telephone Linemen. Two of 
these occupations which showed less than average interest in leading 
people or being in the center of a social situation are at first glance 
surprising. We refer to the Salesmen to Consumer group and the Office 
Manager group, both of which might be expected to indicate average or 
above-average enjoyment of people. An examination of the personal 
data and job descriptions of members of these two groups throws some 
light on their attitudes not indicated by their job titles. We find among 
the Salesmen to Consumers, a large number of individuals who report 
physical handicaps or old age or failure in some other line of work. A 
warranted generalization would seem to be that this group of salesmen 
did not select their vocation but were forced to it through inability to 
function in another profitable field. Therefore, it appears the Salesmen 
to Consumer group, comprising door-to-door salesmen and canvassers, 
and not retail store clerks (as the title might imply), are atypical of sales- 
men in general. Perusal of the data yielded by the group of Office 
Managers suggests that many of them are actually little engaged in 
working with people. Rather they list as their duties “ordering supplies,”
“checking incoming orders,” “making work schedules,” “reviewing re-
ports,” “coordinating.” After checking these job descriptions there is
less mystery in the responses these two groups have made on Scale A. 

The three occupational levels do not differ significantly in their pref- 
erence for taking the lead and being in the center of things involving 
people. Preference for these activities characterizes skilled and semi-
skilled trades as much as professional and managerial occupations. That
the skilled group scored as high as it did may be attributed to the signifi- 
cantly higher mean score for the 20 foremen. 

Scale B. The content of this scale relates to a preference for activities 
of a practical nature, rather than imaginary or glamorous pursuits. 
When comparison is made between the various occupational groups 
and the base group, we find in Table 1 that there were three occupational 
groups showing significant differences. Chemical Engineers show a strong 
preference for practical activities, while Sales Managers and Salesmen 
other than to Consumer are less interested in these matters than is the 
average man. A number of occupations which might be expected to 
show high preference scores failed to do so for these samples. A glance 
at Table 2 shows that the group of occupations labelled “Skilled & Semi-
Skilled Trades” has a mean score that is significantly higher than that of
Clerical & Sales occupations. No other significant difference between 
the occupational level groups was found for Scale B. 
Scale C. This scale may be described as preference for “thinking,” that is,
thinking of a philosophical or speculative nature. Significantly high
mean scores were found for three occupations: Business Managers, 
Account Clerks and Tellers, and Personnel and Counseling Workers. 
The only significantly low mean was that for Carpenters. While the 
results for Personnel Workers and Carpenters may be in accordance with 
the hypothesis concerning the trait measured, those for the other two 
groups are not. Moreover, the lack of high scores for Chemical and 
Mechanical Engineers and Teachers, all of whom deal with abstract 
and conceptional ideas, does not seem consistent with the identification 
of this scale as thinking, philosophizing, and speculating. More cases
and data on additional occupations are needed before this trait can be 
definitely identified. 


Table 2

Differences between Means of Three Occupational Levels on the Five Scales of
Preference Record—Personal *

            Prof. and Man.      Prof. and Man.        Cler. and Sales
                Minus               Minus                 Minus
            Cler. and Sales     Sk. and Semi-Sk.      Sk. and Semi-Sk.
                                    Trades                Trades

Scale A         -.193                .209                  .402
Scale B          .516               -.150                 -.666 (1.95)
Scale C          .103                .820 (2.37)           .717 (1.94)
Scale D          .422               1.382 (2.67)           .960 (1.74)
Scale E          .579               1.668 (4.32)          1.089 (2.65)

* Figures in parentheses represent the critical ratio of the significant differences
found between means.


Table 2 indicates that preference for activities measured by Scale C 
is related to occupational level. We observe that Professionals are higher 
than Clericals, but this difference is not significant. However, both 
Professionals and Clericals are, on the average, more favorable toward 
these activities than is the average Trade worker, and this difference is 
found to be beyond chance expectations. 

Scale D is designed to measure the individual’s preference for activities 
of an agreeable nature—activities free from conflict. The occupational 
group showing the highest enjoyment in pursuits of an agreeable nature 
was Insurance Salesmen, consistent with the stereotype of this group 
as highly amicable. The next three groups, in order of mean score, were 
Teachers, Office Managers and Mechanical Engineers. No occupational 
group yielded a mean score significantly lower than that of the base group.

We find here on Scale D differences among the occupational levels in 
their mean scores. The average Professional is higher than the Clerical
according to our figures, but the difference is small enough so that it can 
be attributed to chance factors. However, we find the critical ratio of 
the difference between Professionals and Trades of such magnitude as to 
be called very significant, and the differences between Clericals and 
Trades significant. We must assume that the average man working in 
the skilled or semi-skilled trades can be expected to be considerably less 
interested in activities of a pleasant, amicable nature than white collar 
or college-bred men. 

Scale E’s items relate to enjoyment of the use of authority and power. 
High mean scores that show important and real differences from the mean 
of the base group are to be found only in the Professional & Managerial 
Group. Business Managers, Personnel & Counseling Workers and Sales 
Managers say that they certainly do like to exercise power. Lawyers 
are being found to be significantly high on this scale in a study made 
after the one reported here. The only group which indicated less than 
average satisfaction in these activities were Factory Workers. It is 
perhaps appropriate to describe here the composition of the group 
“Factory Workers.” These were respondents to the random sampling
who stated they worked in a plant or manufacturing concern and who 
described their jobs by mentioning a single simple function repeatedly 
performed, such as one phase of an assembly procedure. The group 
is heterogeneous in that a great many different industries are repre- 
sented, but they are similar in that their work was described as taking 
place in a factory where they performed a single mechanical or motor act. 
The degree of skill ranged from unskilled to semi-skilled. 

Analysis of mean scores by occupational level indicates that the 
“trait” measured by Scale E is related to position on the occupational
ladder. Professionals & Managerial Workers are higher than Clerical & 
Sales Workers, in general, in their preference for pursuits using authority. 
The critical ratio, however, is only 1.43. A difference as large as this 
in this direction could occur about 8% of the time through the operation of 
chance. But Professionals are sufficiently higher in their mean score 
than Trades that we can say the difference is too great to have arisen by 
chance more than once in a thousand. The difference between Clericals 
and Trades is significant at the 1% level. 
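
The chance probabilities quoted for Scale E can be checked with a short calculation. The sketch below assumes the critical ratios are referred to the normal curve and read one-tailed, an interpretation that agrees with the figure of about 8% given for a critical ratio of 1.43; the function name is ours.

```python
import math


def one_tailed_p(critical_ratio):
    """Upper-tail area of the standard normal curve beyond the critical ratio."""
    return 0.5 * math.erfc(critical_ratio / math.sqrt(2.0))


# Critical ratios for Scale E: 1.43 is quoted in the text, the others in Table 2.
scale_e = {
    "Prof. and Man. minus Cler. and Sales": 1.43,
    "Prof. and Man. minus Trades": 4.32,
    "Cler. and Sales minus Trades": 2.65,
}
p_values = {name: one_tailed_p(cr) for name, cr in scale_e.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.5f}")
```

Under this reading the first difference falls between the 5% and 10% points, the second is far beyond the one-in-a-thousand point, and the third is beyond the 1% point, in line with the statements above.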

The discussion of results up to this point has been from a “vertical”
standpoint, i.e., the results have been examined in terms of the scales. 
A “horizontal” view of the results presents somewhat different informa- 
tion about the occupations and appears worthwhile. 

There were five of the twenty occupational groups which showed no 
significant deviation from the mean of our base group on any one of the 
five scales. These occupations were Accountants, Plant Managers,
Retail Managers, General Office Clerks and Electricians. With reference 
to the qualities measured by the Preference Record—Personal these people, 
or occupational groups, appear to be typical of the general population. 
It is interesting to note that all three occupational levels are represented 
in this group of occupations which show the same characteristics as the 
general population with regard to the variables measured. A study of 
larger groups may, of course, reveal significant differences. 

On the other hand, we find three occupations differed in their means 
from that of the base group on three of the five scales. These occupations 
showing an atypical pattern were Personnel & Counseling Workers, Office
Managers, and Sales Managers. These three occupations can be said to 
differ from the average man more than any of the other seventeen occupa- 
tions with regard to the characteristics studied. 

Of interest also are the results for the three occupational levels. We 
found no important differences between the two higher levels on any of 
the five scales. However, the Skilled and Semi-skilled group, as shown 
in Table 2, shows rather marked differences from the other two groups, 
Professional & Managerial and Clerical & Sales. 

In comparing the highest occupational level with the lowest we see 
that the preferences of Professionals and Managers are higher than those 
of Skilled and Semi-skilled trades workers for activities involving “philo-
sophical thinking” (Scale C), pleasant relations (Scale D) and the use of
authority (Scale E). 

Comparison of the group of Clerical & Sales occupations with those 
of the Skilled & Semi-skilled trades workers, reveals the same differences 
as those described in the paragraph above plus a difference in the opposite 
direction on Scale B. On Scale B, preference for activities of a practical 
nature, we observe in Table 2 that the trades occupations show a signifi- 
cantly higher mean score than the clerical group. 


Summary 


This study indicates that each of the five scales makes some dis- 
criminations by occupation, and that there is a relation between some 
occupations and the characteristic measured by each scale. 

Fifteen occupations differed on one or more of the scales. Five of 
the occupations did not differ from a general population sample on any 
of the scales. 

We found that differences also occurred that are related to the level 
of the occupation in the economic or educational scheme. These differ- 
ences were rather large and significant between the group of occupations 
labelled Skilled & Semi-Skilled Trades and the group, Professional &
Managerial, and also the group, Clerical & Sales. 
Received February 7, 1949. 

Early publication. 

The OL Key of the Strong Test and Drive at the
Twelfth Grade Level * 


Stanley R. Ostrom 


Department of Public Instruction, Dover, Delaware 


One of the baffling problems facing educators today is that of finding 
an instrument that will determine, with an acceptable degree of accuracy, 
which pupils possess the pattern of traits that enable them to make the 
best use of their abilities. If an instrument could be found that made it 
possible for a counselor to distinguish subjects whose backgrounds and 
native endowments were such that they could easily be activated to exert 
a maximum of energy from subjects whose backgrounds had pre-disposed 
a more lethargic set, it would be possible to predict scholastic and voca- 
tional success much more accurately than is now the case. 

The Occupational Level Key of the Strong Vocational Interest Blank 
for Men has been recommended by Strong (6, p. 195) and Darley (1, 
p. 60) as an instrument that will enable a counselor to make this distinc-
tion. Kendall (3) and Ostrom (4) have demonstrated that the OL key
of the Strong Blank can be used with considerable confidence for this 
purpose at the College Freshman level. This paper reports an attempt 
to determine the utility of the OL key at the twelfth grade level. 

Two hundred twelfth grade boys enrolled in four Central New York 
high schools formed the sample. One-half of these boys cooperated in 
an intensive study and the total group participated in a study which 
utilized their academic aptitude as measured by the American Council 
on Education Psychological Test scores, drive¹ as measured by the OL
key, and four year academic grade averages. 

The 100 boys who cooperated in the intensive study were selected 
in the following manner: from three of the four high schools a total of 
sixty boys were chosen so that twenty of them had very high scores on 


* This paper is one of a series reporting research in tools and techniques of counseling 
conducted at the Psychological Services Center at Syracuse University. It is a portion 
of a paper submitted as a Doctor’s Thesis under the direction of Dr. Maurice Troyer 
in partial fulfillment of the requirements of the degree of Doctor of Education in the 
School of Education, Graduate Division of Syracuse University, 1948. Other advisers 
to whom the writer feels deeply indebted are Dr. Milton E. Hahn, Dr. William E.
Kendall, Dr. C. Robert Pace, and Dr. Eric Gardner. 

¹ For purposes of simplicity, the pattern of traits discussed in the first paragraph will
be represented in subsequent pages of this report by the term drive. 
the OL key, twenty of them had very low scores, and twenty of them had
scores that clustered around a scaled score of fifty. Thus, three groups 
which were differentiated by OL were obtained. In the fourth high 
school, forty boys were chosen in such a manner that their OL scores fell 
on a continuum. This was done to determine whether or not spuriously
high relationships between OL and the experimental variables would be 
obtained in the other three high schools due to sampling methods. 

Three new instruments were devised for purposes of checking on the 
results of the OL key. These instruments were: (1) a Teacher’s Rating, 
(2) an “Open End” interview, and (3) a “Guess Who” questionnaire. 
The Teacher’s Rating (see Figure 1) was produced to measure drive in 
the four following areas: (1) drive for hobby satisfaction, (2) drive for 
scholastic achievement, (3) drive for co-curricular achievement, and (4) 
drive for vocational attainment. Each of the four traits was measured 
on a seven-point scale with discrete descriptions utilized for each of the 
seven points.²

The second instrument, a “Guess Who” questionnaire (see Figure 2)
was devised in an attempt to determine how the young men felt about 
the drive and persistence of their peers. In this instrument ten de- 
scriptions were listed with space provided where each subject could name 
the three of his peers who best signified the quality required of each 
statement.³

The third instrument was the interview (see Figure 3). The writer 
interviewed each individual, making use of the basic set of ten questions. 
The subject was permitted to elaborate on each question as much as he 
desired. From time to time, secondary questions were asked to en- 
courage the subject to enlarge on the response given to the primary 
question. By means of the ten questions, the writer attempted to elicit 
from the subject information from his background, his past school, 
work, and hobby experiences as well as his hopes and plans that gave 
evidence of the presence or absence of drive.⁴

It was necessary to quantify the results of the three new instruments 
before they could be of any value in determining relationships between 
their results and those of OL. 

The Teacher’s Rating Scales were filled in by five teachers who had 
known each boy for at least one year. Each trait was measured on a 
seven-point scale, hence the maximum score obtainable on each trait 

² r = .89 ± .03, N = 40. Test-retest method with two week interval.
³ r = .94 ± .02, N = 40. Test-retest method with two week interval.
⁴ No measure of reliability determined.
was seven, and the maximum score obtainable from all four traits was
twenty-eight. 

Teachers vary in the ratings they give in two ways. First, some tend 
to rate all students relatively high while others are more conservative in 
their evaluations. Second, some raters are very discriminating in rating 
students, and the results they obtain vary over a large portion of the 
range; others are much less discriminating, and the ratings they give 
cover only a small portion of the range. 


Fig. 2. “Guess Who.”


Following you will find a number of descriptions which have been listed. You will 
also be given a list of boys from your class. We are asking you to list the three boys 
from this list that best fit each of the statements. You are not asked to choose only 
your friends. The boys who best fit the descriptions may be boys you do not like very 
well. Thank you for your cooperation. 


1. The boy whom you feel will make the most of his abilities: 


. The boy 


. The boy 


. The boy on whom you would be most willing to bet in a boxing match if he were 
matched with a boy of equal size, strength, speed, and ability: 


. The boy who would be most apt to come back and win in a set of tennis if the score 
against him were 5-4 with the count in the final game being “Add” against him:

Fig. 3. Questions Used for Personal Interview.

1. Would you mind telling me what your father does for a living?
2. Do you know the highest grade (or degree) your father and your mother attained?
3. Would you give us a fairly complete picture of your work experience?
4. What do you expect to do when you are through with school?
5. Would you discuss your plans for gaining the training required for that job?
6. Would you care to tell me how you got interested in ______?
7. Would you say that hobbies have played any part in determining your vocational
goals? If so, how?
8. Do you feel that you are satisfied with your school progress?
9. Would you say that your school work to date is a fair indication of your abilities?
10. What would you like to be doing in ten years?


To correct for these two difficulties the ratings of all the participating
teachers were converted into comparable measures.⁵

After the ratings had all been made comparable, the average rating
was then changed to a T-score⁶ (2, p. 99) so it could be utilized in
further statistical procedures.
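
The two conversions just described can be sketched as below. The rescaling follows the footnoted formula (2, p. 121), with the common distribution given a mean of 4 and a standard deviation of 1.8 as stated there. The T-score is taken here as a linear standard score with mean 50 and standard deviation 10, which is our reading of "Walker's Z score" and should be treated as an assumption, as should the use of the population standard deviation. The teachers' ratings are hypothetical.

```python
from statistics import mean, pstdev


def to_common_scale(ratings, target_mean=4.0, target_sd=1.8):
    """Map one rater's scores (distribution B) onto the common scale (A).

    X_BA = (s_A / s_B) * X_B - [(s_A / s_B) * M_B - M_A]
    """
    m_b = mean(ratings)
    s_b = pstdev(ratings)  # assumption: population SD; the source does not say
    ratio = target_sd / s_b
    return [ratio * x - (ratio * m_b - target_mean) for x in ratings]


def t_scores(values):
    """Linear standard scores with mean 50 and SD 10 (no normalizing step)."""
    m, s = mean(values), pstdev(values)
    return [50.0 + 10.0 * (v - m) / s for v in values]


# Hypothetical ratings of the same five boys by a lenient and a strict teacher.
lenient = [5, 6, 6, 7, 7]
strict = [1, 2, 2, 3, 4]
lenient_adj = to_common_scale(lenient)
strict_adj = to_common_scale(strict)

# Average the comparable ratings for each boy, then express them as T-scores.
averaged = [(a + b) / 2.0 for a, b in zip(lenient_adj, strict_adj)]
boys_t = t_scores(averaged)
```

After rescaling, both teachers' ratings share the same mean and spread, so a lenient or narrow-range rater no longer carries more or less weight in the average.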

In using the “Guess Who” device the boys were asked to list three
boys from the group in their school whom they felt best satisfied each of 
the ten descriptions. It was possible for a boy to list one of his peers on 
several questions. This happened on numerous occasions. The scores, 
which were obtained by counting the number of times each boy was 
listed, ranged from four to ninety-three. The scores thus obtained were 
also converted to T-scores. 
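
The counting step can be illustrated with a small sketch. The ballot layout (each respondent supplying one list of three names per description) and the names themselves are hypothetical, and only two descriptions are shown instead of ten.

```python
from collections import Counter


def guess_who_scores(ballots):
    """Count how many times each boy is named across all ballots.

    `ballots` maps a respondent to a list of name lists, one list per
    description. This layout is a hypothetical one chosen for illustration.
    """
    counts = Counter()
    for answers in ballots.values():
        for names in answers:
            counts.update(names)
    return counts


# Two hypothetical respondents answering two descriptions each.
ballots = {
    "Respondent 1": [["Al", "Bob", "Cy"], ["Al", "Dan", "Bob"]],
    "Respondent 2": [["Al", "Cy", "Dan"], ["Bob", "Al", "Cy"]],
}
scores = guess_who_scores(ballots)
```

A boy may be named on several descriptions by the same respondent, so his score can exceed the number of respondents.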

The results of the interviews were quantified by the following method: 
the boy’s responses to each question were rated from one to four in
terms of their expression of drive. A response that denoted much drive 


⁵ $X_{BA} = \left(\dfrac{\sigma_A}{\sigma_B}\right) X_B - \left[\left(\dfrac{\sigma_A}{\sigma_B}\right) M_B - M_A\right]$

Where $X_{BA}$ equals measurement in distribution B transformed into the terms of
distribution A.
$X_B$ equals original measurement in distribution B.
$\sigma_A$ equals standard deviation of distribution A.
$\sigma_B$ equals standard deviation of distribution B.
$M_B$ equals mean of distribution B.
$M_A$ equals mean of distribution A (2, p. 121).
Since seven ratings were used, the mean score for distribution A was taken as the middle
score or four. The standard deviation was arbitrarily set at 1.8.
⁶ For purposes of this study T-score is used in the sense of Walker’s Z score, thus not
assuming normality.
was rated one; a response that denoted very little drive was given a value
of four. The total of the ratings for all questions comprised the score 
for the interview. The boy with the lowest score, fourteen, thus meas- 
ured highest on drive in this measure. Table 1 shows the distribution of 
the four ratings for the 100 boys. These scores were also converted to 
T-scores but in a reverse manner so that low scores resulted in high 
T-scores. 
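
The reversed scoring of the interview can be sketched as follows. The three sets of ratings are hypothetical, and the T-score is again assumed to be a linear standard score with mean 50 and standard deviation 10; the sign of the deviation is flipped so that a low interview total yields a high T-score.

```python
from statistics import mean, pstdev


def interview_score(ratings):
    """Total of the one-to-four ratings given to a boy's answers."""
    return sum(ratings)


def reversed_t_scores(totals):
    """Linear T-scores with the direction reversed (low total = high T)."""
    m, s = mean(totals), pstdev(totals)
    return [50.0 - 10.0 * (x - m) / s for x in totals]


# Hypothetical ratings on the ten questions for three boys.
ratings_by_boy = [
    [1, 1, 2, 1, 2, 1, 1, 2, 2, 1],  # much drive expressed
    [2, 3, 2, 3, 2, 3, 3, 2, 3, 2],
    [4, 3, 4, 4, 3, 4, 3, 4, 4, 3],  # very little drive expressed
]
totals = [interview_score(r) for r in ratings_by_boy]
t_values = reversed_t_scores(totals)
```

Reversing the direction puts the interview on the same footing as the Teacher's Rating and the nomination counts, where a high score already means more drive.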

It was possible to correlate the results of the three original instruments 
with OL scores since, as has already been stated, the scores obtained through
the three instruments were all changed to T-scores. The correlations 
are indicated in Table 2. It is evident from the table that OL correlates 


Table 1

Distribution of Ratings Given the 100 Twelfth Grade Boys on the
“Open End” Interview

                         Number of Times     Per Cent
Rating                   Rating Was Used     of Total

1. (Highest Value)            134               14
2.                            304               33
3.                            322               35
4. (Lowest Value)             169               18

Total                         929              100





Table 2
Relationship Between Three Variables and OL in a High School Population *

                               School I   School II   School III   School IV   Total
Variables                      (N = 29)   (N = 16)    (N = 15)     (N = 40)    (N = 100)

OL–Interview                   .56±.13    .39±.23     .46±.22      .56±.11     .48±.08
OL–Teacher’s Ratings           .54±.14    .60±.17     .24±.26      .28±.15     .41±.08
OL–Guess Who                   .41±.16    .43±.22     .37±.24      .38±.14     .41±.08
Teacher’s Rating–Interview     .72±.09    .71±.13     .71±.14      .57±.11     .59±.06
Teacher’s Ratings–Guess Who    .74±.09    .73±.13     .55±.20      .56±.11     .61±.06
Interview–Guess Who            .57±.13    .38±.23     .43±.23      .30±.15     .39±.08
OL–Total T-Scores
  of the Instruments           .56±.13    .53±.19     .30±.25      .51±.12     .53±.07

* Spearman’s Rank Difference formula was used in the first three schools due to the
small number of students. In School IV and the Total, Pearson’s Product-Moment
formula was used.
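
For readers unfamiliar with the two coefficients named in the footnote, the sketch below computes both on a small set of hypothetical T-scores. The rank-difference formula is the textbook one, rho = 1 − 6Σd²/(n(n² − 1)), which is exact only when there are no tied ranks; the data and function names are illustrative and not taken from the study.

```python
import math


def ranks(values):
    """Ranks with 1 for the smallest value; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = shared
        i = j + 1
    return result


def spearman_rank_difference(x, y):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def pearson_r(x, y):
    """Product-moment correlation from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical OL scores and interview T-scores for six boys.
ol = [38, 45, 50, 52, 61, 66]
interview = [41, 48, 44, 57, 55, 63]
rho = spearman_rank_difference(ol, interview)
r = pearson_r(ol, interview)
```

The rank-difference coefficient uses only the order of the scores, which is why it is often preferred for very small groups such as Schools I, II, and III.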
to a very significant degree with each of the three instruments and that
it correlates to a highly significant degree with the total score which re- 
sulted when the T-scores of each of the three instruments were added. 
It will also be noted that the magnitude of the correlations obtained in 
School IV does not vary significantly from those obtained in Schools I, 
II, and III. This tends to show that choosing three groups of high, 
average, and low OL scores, as was the case in School I, II, and III, did 
not in this instance permit spuriously high results. 

As a further check, Chi Square was used to determine the relationship 
between OL scores and the scores obtained in the Teacher’s Ratings, 
“Guess Who,” interviews, and total ratings. As can be seen in Table 3
all four Chi Square results are of a magnitude that justify the rejection 
of the Null Hypothesis at the one percent level. 


Table 3

Relationship Between OL and Three Variables *
for High School Population

                               Chi        Confidence
Variables, Total 88            Square     Level

Guess Who and OL               15.22      >1
Teacher Ratings and OL         24.06      >1
Interviews and OL              16.23      >1
Total and OL                   22.12      >1

* The Null Hypothesis states that the three OL groups: high, average, and low,
do not constitute different populations in terms of the “Guess Who” ratings, Teacher’s
Ratings, interview results, and the total results obtained by adding the T-scores of the
three variables for each boy. A chi square of 13.277 was necessary to reject the Null
Hypothesis at the 1% level of confidence.
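
A chi-square test of this kind can be sketched as follows. The critical value 13.277 is the 1% point for four degrees of freedom, which suggests a three-by-three table (three OL groups against three levels of the criterion); that layout is our assumption, and the cell frequencies below are hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


CRITICAL_1_PERCENT = 13.277  # value quoted in the Table 3 footnote (4 df)

# Hypothetical counts: rows are high, average, and low OL groups;
# columns are high, middle, and low standing on the criterion.
observed = [
    [20, 8, 2],
    [8, 14, 8],
    [2, 8, 20],
]
statistic = chi_square(observed)
reject_null = statistic > CRITICAL_1_PERCENT
```

If the three OL groups were drawn from the same population, the counts in each row would be roughly proportional to the column totals and the statistic would stay small.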


Having found a relatively high relationship between OL and the in- 
struments described above, an attempt was made to determine the relation- 
ship between OL and school achievement as measured by school academic 
grade averages. 

The assumption on which the study was based was that excellence in 
school was to some extent determined by motivation or effort expended. 
To find this relationship the 200 boys from the four high schools were 
divided into two groups, the first being made up of boys with high OL 
and the second made up of boys with low OL. With these two groups 
the following two questions were posed: (1) do the two groups differ 
significantly in scholastic achievement? (2) if so, how much of this 
difference is due to OL? 
Table 4 registered an F-ratio of 5.66 which was of a magnitude that
places the confidence level for the rejection of the Null Hypothesis be- 
tween the five and one per cent levels, thus answering the first question 
in a doubtful affirmative. To answer the second question it was necessary 
to adjust for the other variable, academic aptitude as measured by the 


Table 4
Analysis of Variance of Honor-Point Ratios
N = 100 Twelfth Grade Boys

Source of      Degrees of     Sum of       Mean                  Test of
Variance       Freedom        Squares      Square      F*        Hypothesis**

Within            198         5417.12       27.35
Between             1          154.88      154.88      5.66      Remain in doubt

Total             199         5572.00

* Where F = greater mean square/lesser mean square. By referring to Snedecor’s
tables of F (5, 222-225), we may use the following three rules in testing the hypothesis:
(a) reject the hypothesis tested, if the calculated value of F is greater than the 1% point
given in the tables; (b) accept the hypothesis tested, if the calculated value of F is less
than the 5% point given in the tables; (c) remain in doubt, if the calculated value of F
lies between the 5% and 1% points given in the tables.

** The Hypothesis tested is a null hypothesis concerning the difference between
means of groups, i.e., there is no significant difference between the means of groups.
(The 1% point was 6.76 and the 5% point was 3.89.)


Table 5
Complete Analysis of Variance and Covariance: 100 Twelfth Grade Boys *
(Partialling out the Effect of Academic Ability)

                                                                    Adjusted or
Source of         Degrees of    Sum of        Sum of       Sum of   Reduced Sum    Test of
Variance          Freedom       Squares y²    Squares x²   XY       of Squares     Hypothesis

Within means
  of groups          198        5417.12       18722.70     4931.12    4118.35
Between means
  of groups            1         154.88        1039.68      401.28      14.85

Total                199        5572.0        19762.38     5332.40    4133.20

* See footnotes for Table 4.
American Council on Education Psychological Examination. When the
data were adjusted for academic aptitude by means of covariance, as 
shown in Table 5, an F-ratio of only .7 emerged. Since this was not of 
a magnitude to justify rejecting the Null Hypothesis, the answer to 
Question 2 must be that the difference in academic grade averages due 
to OL was almost negligible. 
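
The two F-ratios can be recomputed from the sums of squares and products printed in Tables 4 and 5. The sketch below assumes the usual covariance adjustment, in which the sum of squares for grades (y) is reduced by (Σxy)²/Σx² and one degree of freedom is removed from the within-groups term for the fitted regression; the function names are ours.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Between-groups mean square divided by within-groups mean square."""
    return (ss_between / df_between) / (ss_within / df_within)


def adjusted_ss(ss_y, ss_x, sp_xy):
    """Sum of squares of y after removing its linear regression on x."""
    return ss_y - sp_xy ** 2 / ss_x


def decide(f, five_percent=3.89, one_percent=6.76):
    """Apply the three rules quoted in the footnote to Table 4."""
    if f > one_percent:
        return "reject"
    if f < five_percent:
        return "accept"
    return "remain in doubt"


# Table 4: grade averages without adjustment.
f_unadjusted = f_ratio(154.88, 1, 5417.12, 198)

# Table 5: grade averages (y) adjusted for academic aptitude (x).
within_adj = adjusted_ss(5417.12, 18722.70, 4931.12)
total_adj = adjusted_ss(5572.00, 19762.38, 5332.40)
between_adj = total_adj - within_adj
# Assumption: one degree of freedom is lost to the regression on aptitude.
f_adjusted = f_ratio(between_adj, 1, within_adj, 198 - 1)

print(round(f_unadjusted, 2), decide(f_unadjusted))
print(round(f_adjusted, 2), decide(f_adjusted))
```

The recomputed adjusted sums of squares agree with the printed 4118.35 and 4133.20 to within rounding, and the adjusted F falls well below the 5% point, which is the basis for the conclusion that the grade difference attributable to OL was almost negligible.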


Summary 


1. A definite relationship was demonstrated between OL on one hand, 
and out-of-school and co-curricular evidences of drive on the other hand. 
Thus it appears that boys who evidence much energy and activity in the 
less formal school situations and in everyday life situations as a rule give 
responses on the Strong Blank which result in high OL. 

2. No relationship was demonstrated between OL and high school 
academic grade averages. The reasons for this can be only conjecture 
but a few of them are ventured. It might be that high school does not 
present a challenge to most boys with the result that marks which enable 
a boy to “get by” are satisfactory. The possibility that boys satisfy their
desires to achieve through co-curricular activities and life situations 
cannot be ignored. Furthermore, it is common knowledge that high 
school marks are not always valid. Questionable marks could easily 
cause a relationship to fail to emerge. It might be pointed out further
that the use of the Strong Blank among high school students is question-
able due to the immaturity of high school students. Strong has pointed
out that interest patterns change quite extensively during the high
school years. He states, “roughly speaking, one-third of the change in
interests is between 15.5 and 16.5 years, one-third between 16.5 and 18.5
years, and one-third between 18.5 and 25 years” (6, p. 259).


Received October 7, 1948. 

An Objective Evaluation of Counseling


Barbara A. Kirchheimer, David W. Axelrod, and 
George X. Hickerson, Jr. 
University of California Counseling Center, Berkeley 


The development of objective criteria for evaluating the effectiveness 
of counseling has traditionally been a matter of extreme difficulty. 
Apparently, the first studies in the evaluation of faculty counseling are 
the 1925 unpublished studies of Paterson and Langlie at Minnesota; and 
that of Lemon (6) at Iowa in the same year dealing with counseling by 
professionally trained counselors. Lemon’s work consisted of intensive 
remedial training for half of the lowest decile of students on the Iowa 
Qualifying Examination. At the end of three years, Holladay (5), sum-
marizing Lemon’s study, reported that the “counseled” group were making
a better academic adjustment than the equally weighted group left to 
their own devices. However, Freeman and Jones (4) in a final report of 
the same group state that at the end of their college career there was 
no difference between the two groups, because academic failure appeared 
later for the experimental group. 

Use of the “spoon-feeding” type of counseling is shown in the studies
of Cowley (3) with Ohio State Freshmen football players, Newland and 
Ackley (7) with high school sophomores and Williamson (9). In William- 
son’s study made on Art College students, he found as Paterson and 
Langlie had previously found with Engineering students, that the grade 
point average of probationary students was not improved by faculty 
counseling. Williamson concluded that grade point average is not ade- 
quate as a criterion of the effectiveness of counseling, or that other 
counseling methods must be used than those involved in his study. 

Two years later, Williamson (10) showed significant increases in 
honor point ratio for a student group counseled by trained counselors at 
the University of Minnesota Testing Bureau when compared with a 
matched non-counseled group. In a later study Williamson and Bordin 
(13) made use of subjective evaluations of adjustment and cooperation, in 
addition to grade point average. In a further paper on this same study 
(11) the authors show that both adjustment and grade point average are 
significantly better for a counseled group than for a matched non- 
counseled group. Since both these criteria are significant at the 1% 
level, one wonders why it was felt necessary to go beyond the grade point 
average and use the subjective composite criteria. In criticizing the
techniques for evaluating counseling, Williamson and Bordin (12) feel 
grade point average is a poor criterion because of the dissimilarity in 
pattern of subjects taken. However, the alternative of using stand- 
ardized achievement tests has limitations in comparing achievement in 
a number of areas. Moreover, the fallibility of the measuring instru- 
ment itself must be emphasized. 

Blackwell (2) in a client-centered counseling program at the Uni- 
versity of Texas reports significant increases in grade point average for 
a counseled group of 40 compared with a matched non-counseled group. 
Ward and Tyler (8) at the University of Oregon show a slightly better 
record for a counseled group than for a matched non-counseled group 
in grade point average, as well as on their special scale attempting to meas- 
ure college adjustment. Beaumont (1) in a somewhat confusing article 
purports to show that discrepancies in academic adjustment were due 
in a large measure to differences in academic counseling. However, he 
points out the fact that most “academic” counseling is more concerned
with subjugating the individual to the academic machine than with the 
integration of the individual’s personality. 

In most academic settings, grades alone are an objective indication of 
progress or adjustment. In view of the fact that grades are the only 
specific criterion of which we are in possession, that they lend themselves 
to objective treatment, and that, with all their weaknesses, they are 
the accepted gauge of academic success or failure, the present authors 
have adopted this criterion as the most workable measure so far available 
whereby to evaluate the success of a counseling program. 

In evaluating the effect of counseling, an amplified approach might 
include considering the results upon grades of change of major course of 
study. A change of major often accompanies vocational and/or educa- 
tional counseling, and the effect of such a marked step is insufficiently 
investigated. From comparison of pre- and post-counseled grades, we 
may have some clues to the effectiveness of the change of major itself, 
and of the professional counseling which produced it. 


Selection of Groups and Methodology 


Accordingly, it was decided to study veteran students at the Uni- 
versity of California, Berkeley Campus. High admission requirements 
and fairly rigid disqualification regulations result in a rather homoge- 
neous, high caliber population. The average grades of undergraduate 
veteran students appear in Table 2. 

If any evaluation of counseling is to be made, the kind of counseling 
under investigation should be described. It is individual, consisting of 
as many interviews as are required to develop mutually a vocational
and/or educational plan, with all needed testing individually planned, 
and use of the occupational library maintained by the Occupational In- 
formation Specialist of the Center. The psychological training and ex- 
perience of the Counselors and Psychometrists, and the services of a 
Consulting Psychiatrist insure that each counselee’s total personality 
and situation is considered, rather than simple vocational or educational 
symptoms. Techniques are eclectic, with the constant objective of 
formulating an optimal, realistic plan. A coordinate objective is the 
growth of the counselee so that he may carry out the plan. Counseling 
is concerned with the development of the individual rather than with the 
improvement of grades. An important point is that the educational 
plan is a joint agreement between counselor and counselee. Counseling 
cannot be superimposed but must be the result of mutual understanding 
and real acceptance. 

Because of the dual approach in determining effect on grade average 
of counseling, and of change of major, with and without counseling, the 
following groups were used: 


I   Counseled Change. Changed major as a result of a mutual decision of Counselee and Counselor.

II  Non-Counseled Change. Changed major without any contact with the Counseling Center.

III Counseled No-Change. Continued same major as a result of a mutual decision of Counselee and Counselor.

IV  Non-Counseled No-Change. Continued same major without any contact with the Counseling Center.


Counseled groups were selected from the files of the Counseling Center of 
the University of California, Berkeley, operating under contract with the 
Veterans Administration for advisement of veterans. All veteran students 
included in Groups I and III received counseling under Public Law 346 (G. I. 
Bill), coming voluntarily to seek assistance for a variety of reasons. 

The University of California, Berkeley, Counseling Center has unfortu- 
nately been in existence only since October, 1946, and few cases of veterans 
who had changed major were available who had had at least one semester of 
enrollment prior to, and had completed one semester following, counseling. 
All records filed in chronological order were reviewed by a clerk with instruc- 
tions to select every case of a student enrolled as an undergraduate in the 
University at the time of requesting counseling, who signified intention at the 
last counseling interview of enrolling in a different Department, School, or 
College within the University of California the following semester. No case 
meeting these requirements was eliminated. 

There was considerable range in type of change. The only common change 
was from some form of Engineering (Mechanical, Electrical, Civil, and Indus- 
trial) to Business Administration, which accounted for six of the thirty-five 
cases. Examples of other changes were Chemistry to Architecture, Physics to
Social Welfare, Forestry to Agricultural Economics, Chemistry to Psychology. 

These students must have completed their counseling one full semester 
preceding the selection in order that grades following the change might be 
obtained. It had been hoped that grades two semesters before and two semes- 
ters after counseling might be obtained, but this was not possible at this time. 
Williamson’s (11) interpretation of his data is that the effect of counseling is
apparent in the first quarter following counseling, and no further increase in 
grades occurs in succeeding quarters. Unfortunately his hypothesis cannot be 
explored with our groups at this time. 

The number of our Counseled Change Group, for these reasons, was only 35, 
and for purposes of comparison other groups were constituted of equivalent 
size. Various studies have matched groups on the basis of intelligence, sex, 
age, etc. Williamson (12) points out that in such matching it is impossible to 
include such significant variables as motivation, personality, or emotional 
stability. Matching on intelligence test scores might have been desirable, but 
no such scores were available for those groups which had not gone through the 
Counseling Center. An unpublished study made at this Center by William R. 
MacKay compares the grade point average of those veteran students availing 
themselves of the Center’s facilities with the general grade point average of all 
veteran students at the University. This study showed no significant difference
in grade point average of the counselee, and also that on the ACE Psychological
Test, the average counselee score was at the 82.7 percentile (σ 20.16).
As already pointed out, the high admission requirements and fairly rigid 
disqualification regulations result in a rather homogeneous high caliber popu- 
lation within the University. The present authors, therefore, feel that a 
random sample drawn from the University population may be assumed to be 
roughly equivalent in intelligence and that no matching between groups was 
important other than that all subjects should be undergraduate male veteran 
students at the University of California, Berkeley. 

Only six changes of major were made by members of the Counseled Change 
Group between Fall 1946 and Spring 1947, and 29 such changes occurred 
between Spring 1947 and Fall 1947. For this reason, for the other groups the 
two semesters Spring 1947 and Fall 1947 were used for comparison, and are 
designated 1st and 2nd semester.

The Non-Counseled Change Group was, like all other groups, collected by a 
clerk, who reviewed University alphabetical records of veterans for these two 
semesters, selecting the first 35 males who registered a change of major between 
these semesters, and who had at no time contacted the Counseling Center. 

The Counseled No-Change Group consisted of the first located 35 males in 
the Center’s files who were enrolled in the University as undergraduate students 
for the two semesters in question, and who did not in this period change their 
majors. All cases satisfying these criteria were retained. 

The Non-Counseled No-Change Group was selected in the same manner as 
Group II, except that the first 35 cases enrolled both semesters who did not 
change majors and who had not contacted the Counseling Center comprised 
this group. 

The college year distribution of the Counseled Change Group (Group I) is 
as follows: Freshmen 6; Sophomores 17; Juniors 11; Seniors 1. The year dis- 
tributions of the other 3 groups very closely approximated this, with very few 
students who were not divided between the Sophomore and Junior years. 

Grade point average at the University of California is computed on the
basis of:
Three points per unit of credit for A,
Two points per unit of credit for B,
One point per unit of credit for C,
No points per unit of credit for D and F.
The sum of the grade points divided by the number of units for which
registered yields the grade point average (G.P.A.).
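
The computation described above can be sketched in a few lines. This is only an illustration: the function name `grade_point_average` and the sample program of courses (three A's and one B, each assumed to carry three units) are hypothetical and not taken from University records.

```python
# Grade points per unit of credit, as listed above (D and F earn none).
POINTS_PER_UNIT = {"A": 3, "B": 2, "C": 1, "D": 0, "F": 0}


def grade_point_average(courses):
    """Return the G.P.A. for a list of (letter_grade, units) pairs.

    The divisor is the number of units registered, so D and F units
    lower the average even though they earn no grade points.
    """
    total_units = sum(units for _, units in courses)
    if total_units == 0:
        raise ValueError("no units registered")
    total_points = sum(POINTS_PER_UNIT[grade] * units for grade, units in courses)
    return total_points / total_units


# Hypothetical program: three A's and one B, each assumed to carry 3 units.
example = [("A", 3), ("A", 3), ("A", 3), ("B", 3)]
print(grade_point_average(example))  # 2.75
```

On this scale 1.00 corresponds to a straight C average, which is why averages below 1.00 are treated as deficient.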

In the handling of our data, the significance of the differences obtained was 
calculated according to the formula for the critical ratio of the difference over 
standard error of the difference. A value of the critical ratio of 1.96 is reliable 
at the 5% level, and a value of 2.58 is reliable at the 1% level. 
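
A minimal sketch of the critical ratio follows. The article says only that the difference is divided by the standard error of the difference; the independent-samples formula for that standard error used here is an assumption, as are the function names and the illustrative figures in the final call.

```python
import math


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two means divided by its standard error.

    Assumption: the standard error is computed as for two independent
    samples, sqrt(sd_1**2 / n_1 + sd_2**2 / n_2).
    """
    se_diff = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_2 - mean_1) / se_diff


def two_tailed_level(cr):
    """Two-tailed probability of a normal deviate at least as large as |cr|."""
    return math.erfc(abs(cr) / math.sqrt(2))


# The two reference values quoted in the text.
print(round(two_tailed_level(1.96), 3))  # 0.05
print(round(two_tailed_level(2.58), 3))  # 0.01

# Hypothetical figures, not taken from the tables of this study.
print(round(critical_ratio(1.20, 0.60, 35, 1.50, 0.60, 35), 2))  # 2.09
```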


Results 


The grade point averages for the two semesters studied for all four 
groups are given in Table 1. 


Table 1

Summary of Grade Point Average and Changes

                                1st Semes.           2nd Semes.                G.P.A.               Range of                  C.R. of
Number = 35 Each                G.P.A.       σ1      G.P.A.       σ2           Change    σch        Changes                   Change

I.   Counseled Change           1.13         .61     1.65         [illegible]  .52       .66        −.70 to [illegible]       3.59
II.  Non-Counseled Change       [illegible]  [illegible]  [illegible]  [illegible]  .24   .62        [illegible]               1.70
III. Counseled No-Change        1.46         .63     1.54         .56          .08       [illegible]  [illegible]             .56
IV.  Non-Counseled No-Change    1.46         .62     [illegible]  [illegible]  −.07      [illegible]  [illegible] to .76      [illegible]





From Table 1 it may be noted that the Counseled Change Group improved
from slightly better than a C average (1.13) to a B− average
(1.65), or a gain of .52 grade points, a change which is significant at better 
than the 1% level. 

Since it was felt that grade point average may be affected by elective 
courses not pertinent to the major, a calculation was also made of only 
courses in or required by the major. Two cases were necessarily elimi- 
nated in this calculation because, although they had officially made a 
University transfer, they had not yet undertaken any courses in the new 
field. Incidentally, these two cases were among the seven whose grade 
point average was lowered following the statement of change. For the 
remaining thirty-three cases the mean grade point average in the major 
courses only before the change was .946 (a level of deficiency) and after- 
wards was 1.68, or an increase of .734 grade point on the average. 

Twelve students of the 35 received a grade point average of less than 
1.00 (deficient level) for their semester’s work prior to counseling, while 
only one student received a grade point average of less than 1.00 after 
counseling. A number of individual examples may be cited. One student 
who had a .77 grade point average (down grade points) with C’s and D’s,
under a new major the following semester rated three A’s and 1 B or an
A− average (2.75). One student receiving 2 D’s and 2 F’s improved to
four C’s. 

Whereas the Non-Counseled Change Group had only a slightly higher 
grade point average initially than the Counseled Change Group, its in- 
crease (.24) with a change of major was less than half as great, a change 
significant only at the 9% level. 

The change in grade point average of the Counseled No-Change Group 
(.08) and the Non-Counseled No-Change Group (−.07), with Critical
Ratios of less than 1, were not significant changes. 

It had previously been found by the Coordinator of Veteran Affairs 
of the University of California, Berkeley Campus, that grades were 
inversely related to size of study load, i.e., number of units carried. For 
undergraduate students the averages were as follows: 


Table 2

Grade Point Average Compared with Average Number of Units Carried

                  Study     Grade Point
Semester          Load      Average

Fall, 1945        11.2      1.91
Spring, 1946      12.5      1.57
Fall, 1946        13.5      1.53
Spring, 1947      14.1      1.37
Fall, 1947        14.2      1.41





It was felt, therefore, that such an increase as shown by Group I 
might be partially a result of a decreased study load. As can be seen in 
Table 3, the study load of the Counseled Change Group went up, and 
therefore, such explanation for their higher average must be rejected. 
The Non-Counseled Change Group on the other hand did decrease their 
study load slightly. However, both of the No-Change Groups were 
carrying a slightly heavier program in the second semester than were the 
change groups, but again the counseled group was slightly more heavily 
loaded than the non-counseled, although they decreased rather than in- 
creased their program. 

As can be seen in Table 4, the most significant difference, much beyond the
1% level, is between the Counseled No-Change and Non-Counseled No-Change
Groups. The difference between the Counseled Change and Non-Counseled
Change Groups is significant at the 7% level.

For purposes of comparison, groups were combined to increase their
size. In Table 1 it is evident that both groups which did not change 
major have a higher initial grade point average than the groups which 
changed. Combining them, thus giving two groups with an N of 70 
each, it is found that the No-Change Group (III and IV) has an initial 
grade point average of 1.46, σ .11, while the Change Group (II and I) has
an initial average of 1.15, σ .11. The difference of .31 has a critical ratio
of 17.51, significant at the 1% level, indicating that the students who 
changed majors, whether counseled or not, had a significantly lower grade 
point average initially than those who did not change. This may indicate 
that this group whose grades were below their potentialities endeavored 
to improve them by a change of major or by seeking counseling. Of 
course, half of the No-Change Group also sought counseling. This may 
also indicate that better grades are achieved by those in appropriate 
fields of study. 


Table 3

Change in Average Study Load (units)

                                1st         2nd
                                Semester    Semester    Change

I.   Counseled Change           13.62       14.31       +.69
II.  Non-Counseled Change       14.23       14.00       −.23
III. Counseled No-Change        14.77       14.70       −.07
IV.  Non-Counseled No-Change    14.23       14.50       +.27





Table 4

Critical Ratio of Differences of Changes Between Groups

                                II. Non-Couns.    III. Couns.    IV. Non-Couns.
                                Change            No-Change      No-Change

I.   Counseled Change           1.84              3.17           5.22
II.  Non-Counseled Change                         1.20           2.89
III. Counseled No-Change                                         5.55





By combining the groups according to whether counseled or not, we 
find the Counseled Group (I and III) makes an increase in grade point 
average of .30, σ .62, while the Non-Counseled Group (II and IV) makes
an increase of .09 in grade points, σ .47, a difference of .21 grade points in
favor of the counseled groups. This difference has a Critical Ratio of 
2.31, significant at the 2% level. 
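
The combined figures can be checked with a short calculation. This sketch rests on two assumptions not stated in the article: that with 35 students in each group the combined mean change is the simple average of the two group means (.52, .24, .08, and −.07, as given in the text), and that the critical ratio uses the independent-samples standard error. Under those assumptions the ratio comes out near 2.26, close to but not identical with the reported 2.31; rounding in the published means and σ's is one possible reason for the gap.

```python
import math

# Mean G.P.A. changes reported in the text for the four groups (N = 35 each).
CHANGE = {"I": 0.52, "II": 0.24, "III": 0.08, "IV": -0.07}
N_COMBINED = 70

# With equal group sizes the combined mean is the average of the group means.
counseled = (CHANGE["I"] + CHANGE["III"]) / 2        # about .30
non_counseled = (CHANGE["II"] + CHANGE["IV"]) / 2    # about .085, reported as .09

# Standard deviations of the combined changes, as reported in the text.
sd_counseled, sd_non_counseled = 0.62, 0.47

# Assumed standard error of the difference (independent samples).
se_diff = math.sqrt(sd_counseled ** 2 / N_COMBINED + sd_non_counseled ** 2 / N_COMBINED)

# Critical ratio using the rounded difference of .21 given in the text.
cr = 0.21 / se_diff
print(round(cr, 2))  # 2.26
```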


Discussion


As has been mentioned, the groups in this study were necessarily 
small, and therefore the conclusions that may be drawn are limited. 
It is hoped that when possible this study will be repeated with a larger 
sample. Since the methodology of this study has afforded suggestive 
results it is also to be hoped that this study will be repeated with other 
populations. 

We feel that the use of the criterion of grades is warranted in view of 
their importance for the survival of the student, his future opportunities 
for professional training, or for employment. We fully recognize, however,
how few aspects of “counseling effectiveness” such a criterion may
evaluate. It is a task of the future to develop criteria for these less 
objective areas. When this problem has been mastered, it may be found 
that additional criteria will show more clearly the value of vocational 
and educational counseling. 

It is particularly apparent in this study that most students with 
academic deficiencies eradicated these deficiencies in the semester fol- 
lowing counseling, regardless of whether they changed their major. 
These data imply the social value of counseling in the salvaging of 
deficient students. However, it is no less clear that students making 
satisfactory grades can benefit from counseling. 

From the standpoint of evaluating counseling, we cannot, of course, 
generalize beyond the particular type of counseling under study. With- 
out careful, intensive vocational and educational counseling on an indi- 
vidual basis, with concern for the individual as a whole, results may differ. 

The improvement of grades by counseled students might be attributed 
to whatever factors differentiated those students seeking counseling 
from those who do not, a possibility considered in similar studies. With 
the inclusion of a group who changed majors without counseling, we feel 
that we have effected some equalization of whatever factors may lead 
students to take action of one kind or another to improve their situation. 
As shown in this study the improvement made by those who were coun- 
seled and changed major is considerably greater than that made by those 
who changed major independently. We cannot, of course, demonstrate 
conclusively that the scholastic improvement of the counseled groups 
as compared with the non-counseled groups was due to the counseling, 
since counseling itself is a complex of many variables. Such a possibility
must, however, be considered. The other studies, with similar counseling,
have in general shown similar results.


Summary 


1. A group of male veteran undergraduate students who changed their 
majors as a result of counseling improved their grade point average 
significantly, despite an increase in number of units carried. Improvement
is even more marked if only major subject course grades are
considered. 

2. The difference in grade point average improvement between two 
groups of male veteran undergraduate students who did not change their 
majors, one of which received counseling, was significantly in favor of the 
counseled groups, at better than the 1% level. 

3. When non-counseled and counseled groups were compared, the 
counseled students increased their grade point average by an amount 
more than the non-counseled students with a significance at the 2% level. 


Received October 1, 1948. 

A Follow-up Study of Social Guidance at the College Level *
Margaret Glockler Aldrich 


University of Missouri 


In 1940, the author published a research report entitled “An Exploratory
Study of Social Guidance at the College Level.”¹ Early in 1948
it was decided to check the available records of the girls who as college 
freshmen (1939-1940) were the subjects for the experiment. It was felt 
that eight years would be sufficient for them to have completed their 
undergraduate careers. In checking the records it was found that only 
one girl was still in residence at the University of Minnesota in 1947-1948. 
She returned to school under the G. I. Bill and her transcript looks as 
if she might soon fulfill the requirements for graduation. 

The original study was an attempt to compare two groups of freshmen 
girls who were all cases at the University Testing Bureau.² All of the
girls went through the usual testing and counseling procedures of the 
Bureau. The experimental group received additional guidance in the 
social adjustment area and were directed toward participation in
extra-curricular activities. It consisted of at least one added interview with
each girl in the experimental group stressing her social and activity life. 
In most cases this resulted in a definite contact with one or more of the 
activities in which the girl expressed an interest. The organizations had 
been contacted concerning the general need for good cooperation between 
various campus agencies. They did not know, however, that these girls 
were in any way “special cases.” It seems safe to assume that the girls
in the experimental group were also exposed to the usual social and
extra-curricular program in the same way that all freshmen girls are exposed.


* This follow-up, made while the writer served in the Student Counseling Bureau, 
Office of the Dean of Students, University of Minnesota, was made possible through the 
cooperation of many individuals and agencies. Mention should be made of the follow- 
ing: Dr. E. G. Williamson, Dean of Students; Mr. John Foley, head of the Disciplinary 
Committee of the Office of the Dean of Students who suggested the follow-up study; 
Dr. Ralph Berdie, Director of the Student Counseling Bureau; Mr. James Borreson, 
Director of the Student Activities Bureau; and Dr. Robert Hinckley, head of the Mental 
Hygiene Clinic of the Student Health Service. Special thanks are due the author’s 
major adviser, Professor Donald G. Paterson, who suggested and guided the 1940 study 
and encouraged this follow-up. 

1 Educational and Psychological Measurement. Vol. II, No. 2, April, 1942, pp. 209- 
216. 

2 UTB is now called Student Counseling Bureau. 

The control group had no added counseling but, of course, the girls
in the group were free to make use of the University social and extra- 
curricular program. At the end of the school year both groups were 
retested on several personality scales and given a questionnaire. The 
conclusion reached at that time was: “All of these findings combine to
indicate that, from this small sample, social guidance and directed partici- 
pation in extra-curricular activities improve the ‘social adjustment’ of 
Freshmen girls as measured by personality scales and a questionnaire. 
Not only do the girls in the experimental group make greater mean 
gains, but they feel that they have more friends, participate in more 
activities, and are less critical of the social program than the control 
group. A treatment that makes people feel better satisfied with their 
social life is certainly worthy of further consideration.”³

It should be pointed out that the study involved a very small sample, 
31 experimental and 28 control subjects. Also, both groups were origi- 
nally selected from the lower end of the distributions for freshmen girls 
on the Minnesota Inventory of Social Attitudes—Forms P and B and 
group activities in high school. They did not differ significantly, how- 
ever, from the rest of the freshmen girls in mean ACE Psychological 
Examination score or in mean Cooperative English Test score. The 
experimental and control group were remarkably alike at the original 
testing on six objective measures (ACE, Coop. Eng., Social Beh., Social 
Pref., Rundquist-Sletto Inferiority Scale, and Bell Adjustment Inventory 
—social) and on high school group and individual activities. The control
group was somewhat higher in high school scholarship rank.

Since the study covered only a brief period of time (9 to 12 months), 
it seemed worth while to re-study the groups after a period of eight years. 
This re-evaluation would indicate whether or not the gains revealed in 
the original study were ephemeral or were permanent. 

The follow-up was confined to a check of the records kept by various 
campus agencies. The following agencies were contacted: Student Coun- 
seling Bureau; Student Activities Bureau; Bureau of Admissions and 
Records; Disciplinary Committee; Mental Hygiene Clinic of the Student’s 
Health Service; and the Alumni Association. 

In making the follow-up study, a new card was made for each girl 
with no indication of whether the girl belonged to the experimental or 
control group. All lists were sent to the agencies undesignated. This 
is of importance since several of the recordings involve judgments. When 
all of the data were collected the experimental and control groups were 
separated for analysis. 


3 Op. cit., p. 216. 


Results


Student Counseling Bureau Records. The Bureau records consist of a 
folder for each girl with her test results and a record dictated by the 
counselors of all counseling contacts in the Bureau. Table 1 summarizes 
the quantitative information and indicates that the experimental group 
had made, on the average, slightly more contacts over a slightly longer 
period of time.⁴ The mean number of counseling contacts for both
groups is considerably higher than the Bureau average of about two for 
these years. 


Table 1

Student Counseling Bureau Contacts

                          Mean        Mean No. of       Mean Duration
                          No. of      Contacts After    of Contacts
Group                     Contacts    Retesting         in Mos.

Control (N = 24*)         4.58        1.33              14.1
Experimental (N = 31)     5.74        1.65              15.5





* Four of the 28 girls in the control group were counseled by a counselor for the 
College of Science, Literature, and Arts. The folder of test results was kept by the 
U. T. B., but the interview records were kept in the S. L. A. office. Since these records 
are destroyed after five years, these girls had to be omitted from this part of the study. 


Student Activities Bureau. In the years 1936-1946 the Student Acti- 
vities Bureau kept records of the extra-curricular activities of all students 
in the University. The records were tabulated each quarter by the 
Bureau staff from their membership, committee, and officer lists and 
from publicity in the college newspaper. The director of the Bureau
feels that the records are not too accurate and err in the direction of 
omitting activities. 

The information from these cards has been summarized in Table 2 as 
mean number of activities, committees, and offices per year for the 
number of years the particular girl was in school. It must be emphasized 
that these are approximations and if anything underestimates. Nevertheless,
the results indicate that the girls in the experimental group
participated in more activities, served on more committees, and held 
more offices than those in the control group. 


4 Statistical tests of significance of differences have not been computed because of
the small N’s and a belief that the chief value of the original study, and the present 
follow-up study, is to be found in the control group method of investigating the area of 
“social guidance.” 

Table 2

Student Activity Bureau Record

                          Mean No. of    Mean No. of    Mean No. of
                          Activities     Committees     Offices
Group                     Per Year       Per Year       Per Year

Control (N = 26*)         .62            .03            .08
Experimental (N = 30*)    .96            .26            .27




* There were no cards in the files for two girls in the control group and one in the 
experimental group. 


Gopher Record. A second source of activity record is the yearbook 
of the senior class, the Gopher. Each senior records his own activities 
for his years in college. The results from this source are very incomplete. 
It is only available for the girls who actually graduated and many of 
them did not have their picture and activity record included in the 
Gopher. 

Table 3

Activity Record from Gopher (College Yearbook)

                          Mean No. of    Mean No. of    Mean No. of
                          Activities     Committees     Offices
Group                     Listed         Listed         Listed

Control (N = 8)           2.5            .25            .38
Experimental (N = 10)     4.6            .60            1.00





Table 3 is based on the records of 18 girls who were included in the 
Gopher (a little over 50 per cent of those who graduated). The years 
’41, ’42, ’43, ’44, and ’46 were checked since these are the years listed 
for the graduates on the official transcripts. This rather skimpy evidence 
again points in the direction of greater activity for the experimental group. 

These data can be compared with the Activity Bureau record, Table 
2, by dividing each mean by 4 to get the mean per year. The results 
are strikingly similar as shown in Table 4. 
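
The conversion just described is a single division. As a small sketch (the dictionary layout and the function name `per_year` are illustrative; the means are those of Table 3 and the divisor of 4 is the one named in the text):

```python
# Mean number of activities, committees, and offices listed in the Gopher
# (Table 3). Each senior lists activities for all of his years in college.
GOPHER_MEANS = {
    "Control": {"activities": 2.5, "committees": 0.25, "offices": 0.38},
    "Experimental": {"activities": 4.6, "committees": 0.60, "offices": 1.00},
}
YEARS_IN_COLLEGE = 4  # divisor used in the text


def per_year(means, years=YEARS_IN_COLLEGE):
    """Convert whole-career means to yearly means by dividing by the years."""
    return {name: value / years for name, value in means.items()}


for group, means in GOPHER_MEANS.items():
    print(group, per_year(means))
```

The yearly figures obtained this way are the ones to set beside the Activities Bureau means per year in Table 2.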

These results raise an interesting question which might be investi- 
gated further. There is a common idea that students tend to over- 
estimate their activity record for publication. This small sample did 
not do this, particularly if we remember that there is some evidence that 
the SAB activity record is an underestimate. 

It should be added that the records of Mortar Board (Senior Women’s
Honorary) were also checked for these years. Mortar Board picks its 
members from the entire junior class on the basis of scholarship, leader- 
ship, and service. Through the years at Minnesota this group has 
tended to include the leaders in extra-curricular activities if their grades 
were up to a certain fixed level. Two girls from this study were elected 
to the 1943 chapter of Mortar Board. They were both members of the 
experimental group. 


Table 4

SAB Activity and Gopher Record Compared

                  Mean No. of        Mean No. of        Mean No. of
                  Activities         Committees         Offices
                  Per Year           Per Year           Per Year
Group             Act.     Goph.     Act.     Goph.     Act.     Goph.

Control           .62      .63       .03      .06       .08      .09
Experimental      .96      1.15      .26      .15       .27      .25








Bureau of Admissions and Records. Data from the Bureau of Admissions
and Records consisted of a transcript for each girl. Table 5
summarizes these data.

The academic records of the two groups are similar although the 
control group had a somewhat higher average. It might be well to recall 
that they also had a slightly better high school academic record. 


Table 5 
Information from Official Transcript 

                   Per Cent     Mean       Mean No. of 
Group              Graduated    H.P.R.*    Quarters at Minn. 

Control 
  N = 15             54         1.51        8.86 
Experimental 
  N = 18**           58         1.12        8.87 





*H.P.R. = honor point ratio = honor points/credits, where for each credit of A, 3, 
B, 2, C, 1, D, 0, and F, —1 honor points are given. These were calculated only from 
University of Minnesota grades and for undergraduate work. 

** This figure omits two A.A. (Associate of Arts) degrees: a two year degree granted 
by the General College. There were 5 girls who did some work in General College. 
Their records are not included in Column 2 since General College grades are not directly 
comparable to grades in other colleges. 

Disciplinary Committee Records. The list of girls was sent to the 
head of the Disciplinary Committee of the university. He reported 
that none of the 59 names was recorded in the files of that committee. 

Mental Hygiene Clinic. The list of names was also sent to the head 
of the Mental Hygiene Clinic in the Students’ Health Service. He had 
the names checked against the clinic records. Six girls had contacted 
the Clinic, three from the Control Group and three from the Experi- 
mental. In each case he made an estimate of severity of diagnosis with 
the result that those from the Control Group were labelled ‘severe’ 
whereas none from the Experimental Group were so designated. 

Although the psychiatrist reports that about 5 per cent of the Uni- 
versity population would like to make contact with the Clinic, he esti- 
mates that through the years the Clinic has had facilities for only about 3 
per cent. This is much lower than the 10 per cent of both the experi- 
mental and control group who went to the Clinic. This might be ex- 
plained by the original selection of the groups from the lower end of 
the distributions on the personality scales. One might hypothesize that 
the social guidance did little to prevent the development of problems 
requiring mental hygiene but that these problems were less severe for 
the girls who had the earlier specialized help. Obviously this is little 
more than a hunch. 


Table 6 
Per Cent of Married Graduates* 

                     Married           Not Married 
Group                N      %          N      % 

Control 
  N = 14             10     71%        4      29% 
Experimental 
  N = 18             11     61%        7      39% 








*Includes A.A. degrees. Total N = 32. The alumni office records, however, did 
not list as graduates three girls whose transcripts indicate that they did receive degrees. 


Alumni Association Records. The records of the Minnesota Alumni 
Association are kept only for the students who actually graduate from 
the University of Minnesota. For each graduate there is a fairly com- 
plete record of address and married name. The latter record, as it stood 
at the time of this study, is summarized in Table 6. Clearly, from these incom- 
plete records, a higher percentage of the control group married. If we 
consider marriage an indication of social adjustment, the control group 
(at least those who graduated) is better adjusted. This is the only 
scrap of evidence in favor of the control group. 


Summary 


This follow-up of social guidance can be summarized in three sections. 


1. Those who received special guidance with social problems exceeded 
the control group in: (a) the number of contacts with the Student Coun- 
seling Bureau; (b) the mean number of college activities, committees, 
and offices; (c) the percentage graduating from the University of Minne- 
sota; and (d) a less severe diagnosis for those who contacted the Mental 
Hygiene Clinic. 

2. The groups were much alike in: (a) the mean number of months 
over which the contacts with the Student Counseling Bureau were made; 
(b) the number of quarters in residence at the University of Minnesota; 
and (c) the number of girls who contacted the Mental Hygiene Clinic. 

3. The control group was slightly higher than the experimental group 
in: (a) mean honor point ratio; and (b) the percentage of the graduates 
listed in the Alumni Bureau files as married. 


The small numbers in both groups make more detailed statistical 
analysis of questionable value. From the data available, however, there 
is an indication that the gains originally reported for the socially guided 
group continued throughout their college residence. Again, the tentative 
conclusion of the original study can be re-emphasized with the caution 
mentioned in the last sentence of that study, “the problem was, however, 
essentially an investigation of a method and as such the results should 
be emphasized only as a justification for the further use of the method.” 


Received October 22, 1948. 





Memory in Radio News Listening 


Thomas W. Harrell, Donald E. Brown, and Wilbur Schramm 


University of Illinois 


Questions of practical importance have arisen in the field of radio 
involving the extent to which a listener is able to remember what he 
hears on a newscast. The newscaster is anxious to know how tightly 
he can “pack” his newscast—how many stories he can put into a given 
time without giving his audience more than they can absorb. Beyond 
that, he wants to know the effect on memory of repetition within the 
newscast. He is interested in what kinds of subject matter and what 
treatments of those are remembered better than others. He would like 
to know whether his audience listens for “index words,” whether it re- 
members names and details, whether it remembers items far removed in 
locale as well as it remembers items originating nearby. Finally, he 
would like to know, if possible, what kinds of items discriminate least 
between good memories and poor memories, and therefore, so far as the 
factor of memory is concerned, are mass materials for a mass medium. 

In a situation wherein the average adult American listens more than 
three hours a day to the radio, between 10 and 25 per cent of this time to 
radio news, these questions become of social as well as professional im- 
portance. The study reported here was undertaken in an attempt to 
provide experimental data in an area where the hunch and the thumb 
have ruled. 


Method 


Two entirely different news broadcasts each containing 20 stories 
were written by an experienced news editor. These were designated as 
broadcasts IA and IIA. The 20 stories were then shortened, with care 
taken not to omit any important detail, to permit the addition 
of 10 more stories, making a total of 30 stories in newscasts IB and IIB. 
(All newscasts actually ran 12½ minutes in order to make them the same 
length as most commercial casts on the radio. It was not thought 
necessary for the purpose of this study to insert commercials.) 

For the next series of newscasts, which were written in a highly com- 
pressed style, each of these 30 stories was further reduced in length, which 
provided time for the addition of 10 new stories, making a total of 40 
(newscasts IC and IIC). 

All six newscasts were transcribed. An experienced announcer read 
the casts, transcribing only two per day to avoid staleness. Each cast 
was transcribed to a platter, tape, and wire. It proved more convenient 
to use the tape transcriptions except for one presentation of a platter 
recording. 

The casts were fictitious but plausible. Real happenings were not 
presented because there would then have been some persons who were 
already more familiar than others with the event. That the casts were 
highly realistic was suggested by the questions of some subjects, who, 
even though assured of the contrary, would inquire whether “there was 
anything to” one or more of the stories. 

Memory tests were constructed for each cast. Four-alternative 
multiple-choice questions with a single best answer were used. The 
scoring formula, Rights − 1/3 Wrongs, was used to correct for chance 
successes. One question was asked on each story so there were 40 
questions each on casts IC and IIC, 30 on casts IB and IIB, and 20 on 
casts IA and IIA. All the questions on A casts were repeated in B casts, 
and all questions in B casts were repeated in C casts. The aim was to 
make the questions central to the story and as easy as possible while at 
the same time assuring discrimination from guessing. 
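
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the standard correction for guessing (Rights minus Wrongs divided by one less than the number of alternatives); the function name and the sample listener are hypothetical and do not come from the study.

```python
def corrected_score(rights: int, wrongs: int, n_choices: int = 4) -> float:
    """Correction for chance successes: Rights - Wrongs / (n_choices - 1).

    Omitted questions count neither as right nor as wrong.
    """
    if n_choices < 2:
        raise ValueError("a multiple-choice question needs at least two alternatives")
    return rights - wrongs / (n_choices - 1)


# Hypothetical listener on a 20-question test: 12 right, 6 wrong, 2 omitted.
print(corrected_score(12, 6))

# A listener who guesses blindly on every question expects about one
# right answer in four, so the corrected score averages out to zero.
print(corrected_score(5, 15))
```

With four alternatives each wrong answer costs one third of a point, which is why blind guessing earns nothing on the average.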

Two casts each were presented to ten groups of subjects. Each I 
cast was presented to a group which also heard a II cast of a different 
number of stories. The order of presentation was reversed from one 
session to another because of the possibility of a practice or fatigue effect. 

The method is recognized as being not true to life. In the first place 
the casts were fictitious. In the second place the subjects were as- 
sembled and had fewer distractions than do radio listeners ordinarily. 
It is expected that the experimental conditions would yield a maximum 
of what could be remembered in real life. It is believed that when 
listening to news on the radio the majority of listeners do not give as 
good attention as in the experimental setting. On the other hand, one 
could conjecture that because of the fictitious nature of the news it would 
not be attended quite so well as if it were real, so there would be some 
compensating effect to the extraordinary attention. Some thought was 
given to using real news casts and actual listeners, but the expense of 
such a study was found to be prohibitive. 

Each group of listeners was told the purpose of the study, that there 
would be a memory test after each cast, and that there would be a pref- 
erence question after both casts. 

An effort was made to choose as subjects adults similar in education 
to the average of the American radio listening audience. The majority 
of subjects were enlisted men and women of the United States Air Force. 
These Air Force enlisted men and women were members of the per- 
manent party of a base. Their average educational level was in the 
neighborhood of 10th grade. Their standard scores on the Army General 
Classification Test ranged from approximately 90-115, which is practi- 
cally the range of the complete adult population of military age. 

The subjects also included two groups of nonacademic employees 
and three groups of students at the University of Illinois. One of the 
groups of nonacademic employees was a group of groundsmen whose 
educational level was similar to that of the Air Force subjects. The 
second group of nonacademic subjects were supervisors whose educa- 
tional level ranged from high school graduation to college graduation. 
The student subjects were undergraduates. The subjects were some- 
what above the average of the American public in education and con- 
sequently above the average of the radio listening audience, but since 
over half of the subjects were within the average range in education, they 
were regarded as satisfactory for the experiment. 


Results: I. Memory and the Number of Stories 


A reasonable hypothesis is that if a listener is presented a progressively 
increasing number of items within a fixed period of time, he will remember 
a progressively smaller proportion of them. The results bear this out, 
as Table 1 shows. 

Table 1 
Memory for Broadcasts 

                                       S.E. of                Mean Raw 
Test                   N     Mean %    Mean %     S.D. %      Score 

A (19-20 items)*      320     54.5      1.20      21.55       10.9 
B (29-30)*            264     49.3      1.12      18.30       14.8 
C (40)                308     45.9      1.09      18.50       18.3 





* One item had to be omitted from scoring in two sets of the tests. 


Table 2 shows the statistical significance of these differences. 

A further test of these figures is given in Table 3, which shows positive 
and significant correlations between each pair of test scores.¹ This 
indicates that listeners who were high on one test tended also to be high 
on the other. Therefore, the listeners must have been attending, and 
the tests were measuring the same thing, whatever it was they were 
measuring. 


¹ The variations in size of the coefficients do not make sense to the investigators, and 
are presumably due to chance. 

It seems to be a tenable hypothesis, then, that a listener remembers 
a smaller proportion of items in a fixed-time newscast if the number of 
items is increased from 20 to 30 to 40. The question then follows: where 
is the point of insufficient return? Where does the memory curve fall 
off so sharply that the newscaster may conclude he has overpacked his 
newscast? 


Table 2 
Probability that Memory Differences May Be Due to Chance 








Differences Probability 





A and B (19-20 and 30) .0012 
B and C (29-30 and 40) .013 
A and C (19-20 and 40) .0000 
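
As a rough check on Table 2, the differences between mean per cents in Table 1 can be expressed as critical ratios. The sketch below assumes that the small column of Table 1 (1.20, 1.12, 1.09) holds the standard error of each mean per cent, an inference from the fact that those figures are close to the standard deviation divided by the square root of N. It also assumes a one-tailed normal approximation. The report does not state which test was used, so the output need not reproduce the published probabilities exactly.

```python
import math

# (mean per cent correct, assumed standard error of that mean), from Table 1
TABLE_1 = {"A": (54.5, 1.20), "B": (49.3, 1.12), "C": (45.9, 1.09)}


def critical_ratio(first: str, second: str) -> float:
    """Difference between two mean per cents divided by its standard error."""
    mean_1, se_1 = TABLE_1[first]
    mean_2, se_2 = TABLE_1[second]
    return (mean_1 - mean_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)


def one_tailed_p(z: float) -> float:
    """Upper-tail area of the unit normal curve beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


for pair in (("A", "B"), ("B", "C"), ("A", "C")):
    z = critical_ratio(*pair)
    print(pair, round(z, 2), round(one_tailed_p(z), 4))
```

All three ratios exceed 2, which agrees in direction with the small probabilities reported in Table 2.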





Table 3 
Coefficients of Correlation Between Scores on Each Pair of Tests 

Tests         r        N 

IA-IIB       .63       77 
IA-IIC       .41      127 
IIA-IB       .42       61 
IIA-IC       .43       55 
IB-IIC       .51       80 
IIB-IC       .68       46 





While the differences shown in Table 1 are significant, they are 
nevertheless slight. In fact, they are so slight that a listener actually 
remembers more items from a 30-item cast than from a 20-item cast, 
from a 40 than from a 30. In a 20 item newscast 11 stories are re- 
membered, in a 30 story newscast 15 stories are remembered, and 18 
stories are remembered in a 40 story newscast. It must be concluded, 
therefore, that there is nothing in this evidence, so far as the factor of 
memory goes, to lead a newscaster to set an arbitrary limit below 40 
items in a 12½-minute newscast if his material justifies that many items. 
The factor of audience preference, however, bears strongly on this point, 
as the next section of this report will show. 
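
The arithmetic behind this point is simple enough to set down. The sketch below multiplies the mean per cent remembered in Table 1 by the nominal number of stories in each cast. The function name is illustrative, and the nominal counts of 20, 30, and 40 ignore the one item dropped from scoring in two sets of tests, so the products agree only approximately with the mean raw scores in the table.

```python
def items_remembered(n_items: int, mean_pct: float) -> float:
    """Expected number of stories remembered, given the mean per cent remembered."""
    return n_items * mean_pct / 100


# Nominal cast length -> mean per cent remembered (Table 1)
casts = {20: 54.5, 30: 49.3, 40: 45.9}

for n_items, pct in casts.items():
    print(f"{n_items}-item cast: about {items_remembered(n_items, pct):.1f} stories remembered")
```

The per cent falls as the cast is packed more tightly, but the absolute number of stories remembered still rises from about 11 to about 15 to about 18.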

For what it is worth, these figures suggest that a listener remembers 
a few minutes after a newscast has been heard, about half the items in 
the newscast. This suggests several related questions, such as the kinds 
of material that are remembered best, the effect of repetition, and the 
kinds of cues that arouse the best learning response in the news listening 
situation. These questions are discussed in sections III, IV, and V of 
this report. 


Results: II. Preference and the Number of Stories 
A preference question was asked at the end of each pair of casts. 
The results are shown in Table 4. 


Table 4 
Preferences for Broadcasts 

                                Probability that 
Cast           N        %       True % = 50 

A (20)         49       52 
B (30)         45       48       .3483 

A (20)        135       74 
C (40)         48       26       .0000 

B (30)         83       66 
C (40)         43       34       .0001 





These figures indicate that the broadcast with 40 stories was clearly 
liked less than those of 20 and 30 stories. Approximately three out of 
four people preferred the 20-item casts to the 40-item casts. Almost 
exactly two out of three persons preferred the 30-item casts to the 40- 
item casts. The slight preference for 20 items as compared with 30 
is statistically insignificant. These figures suggest that the memory 
effort involved in listening attentively to a 40-item newscast, though 
quite possible for the average listener, is not popular; and this provides 
good reason for the newscaster to limit his number of items to 30, perhaps 
still better to 20. 
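
The last column of Table 4 asks how likely each split of preferences would be if listeners were really divided evenly. A minimal sketch of such a test follows, assuming a one-tailed normal approximation to the binomial with a true proportion of one half. The report does not name the test actually used, so this is an illustration and the figures need not match the table to the last decimal.

```python
import math


def p_if_evenly_split(n_preferred: int, n_other: int) -> float:
    """One-tailed probability of a split at least this lopsided when the true per cent is 50."""
    n = n_preferred + n_other
    z = (n_preferred - n / 2) / math.sqrt(n * 0.25)
    return 0.5 * math.erfc(z / math.sqrt(2))


# Counts from Table 4: (more popular cast, less popular cast)
comparisons = {"A vs. B": (49, 45), "A vs. C": (135, 48), "B vs. C": (83, 43)}

for label, (more, fewer) in comparisons.items():
    print(label, round(p_if_evenly_split(more, fewer), 4))
```

On these assumptions the 20-item and 30-item casts cannot be told apart, while both are clearly preferred to the 40-item cast.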


Results: III. Memory and Repetition of News Facts 


Sixteen questions were designed so that the facts to be tested were 
repeated more often in one cast than in another. When test results on these 
questions are compared, there is no significant trend discernible, as Table 5 shows. 

On the basis of these results, the hypothesis can be advanced that repe- 
tition of facts in a newscast has no significant effect on audience memory 
of those facts. It may be, of course, that uncontrolled variables entered 
into this result. The number of stories in a newscast may have some 
influence on the effectiveness of repetition. It would seem, however, 
that whatever influence is present here might work in the direction of 
making repetition appear to be more important than it really is. This 
is true because there is more repetition in the casts with fewer stories. 


Table 5 

Per Cent of Listeners Answering Question Correctly, Compared with 
Number of Times Answer Was Repeated in Cast 

Cast IA Cast IB Cast IC 
Times Rep. % Times Rep. Times Rep. 

3 
2 
1 

Cast IIA Cast IIB Cast IIC 

90 80 
64 
62 54 
68 49 
35 
57 55 
44 35 
91 89 97 





It has been shown that there is a slight tendency for memory to be better 
for any single story in a cast with 20 stories as compared to the cast with 
40 stories. Since, in spite of this, the statistical results show repetition 
to be of slight if any importance, there is all the more reason to doubt the 
effectiveness of this kind of repetition. It must be remembered, of 
course, that this was not overt or enforced repetition; it was not done in 
the jangling fashion of “LS/MFT” or even in the style, “I’ll repeat that 
name again.” Those signposts may make repetition more effective in 
creating a response that leads to memory. Furthermore, it may be that 
repetition is more effective in other repetitive situations—for example, 
when a story is heard on more than one newscast. 


Results: IV. Memory and Subject Matter 


The questions were divided by subject matter, and test scores com- 
pared on that basis. This is not an artificial division, inasmuch as most 
newscasts are compartmentalized by some kind of subject matter dis- 
tinctions. The results are shown in Table 6. 

Because of the small number of human interest questions, the dif- 
ference between that score and others is not statistically significant. 
Between the mean per cents right on spectacular events and public 
affairs, the difference is significant at the 5% level; between public affairs 
and name items, at the 1% level.² It appears, then, that name items are 
hardest to remember; that public affairs items are harder to remember 
than stories of fires, windstorms, wrecks, murders, lynchings, and other 
spectacular events; and further tests may show that human interest items 
are easiest of all to remember. 
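
The degrees of freedom quoted in the footnote to this paragraph (80 and 113) can be checked against the story counts in Table 6. The sketch assumes a two-sample t test with pooled variance, for which the degrees of freedom are the two counts added together less two. The report does not say which form of the test was used, and the helper name is illustrative.

```python
# Number of stories in each content class (Table 6)
STORY_COUNTS = {
    "Spectacular Events": 46,
    "Public Affairs": 36,
    "Name Items": 79,
}


def pooled_df(first: str, second: str) -> int:
    """Degrees of freedom for a pooled two-sample t test: n1 + n2 - 2."""
    return STORY_COUNTS[first] + STORY_COUNTS[second] - 2


print(pooled_df("Spectacular Events", "Public Affairs"))
print(pooled_df("Public Affairs", "Name Items"))
```

Both figures agree with the footnote, which suggests that the unit of analysis was the story rather than the listener.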


Table 6 
Average Scores on Questions Classified by Content 

                        No. of 
                        Stories      Mean % 

Human Interest            11           85 
Spectacular Events        46           72 
Public Affairs            36           63 
Name Items                79           53 








Results: V. Memory and Mass Audiences 


One of the sets of tests was analyzed on the basis of how well each 
question discriminated between persons who did well on their two tests, 
and therefore may be supposed to have good memories, and persons who 
did poorly on their two tests, and therefore may be supposed to have less 
good memories. In order to do this, each question was ranked according 
to the difference between the number of participants below median and 
the number above median. When this was done, the middle half of the 
questions was discarded and attention focussed on the highest and lowest 
quartiles. It was assumed that the top quartile contains questions which 
most clearly show the difference between the best and the poorest 
memories; therefore, that the material being tested in this quartile is 
material which is more difficult and less well adapted to mass audiences. 
It was assumed also that the lowest quartile contains questions which 
least clearly show the difference between best and poorest memories; 
and therefore, that the material being tested in this quartile is least 
difficult and best adapted to mass audiences. The material in the two 
quartiles was then analyzed both in terms of subject matter and of the 
approach to that subject matter used in framing that question. Table 7 
gives part of that analysis. 
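
The item analysis just described can be sketched as follows. The sketch assumes that the ranking was based on the number of listeners answering each question correctly above the median total score, less the number answering correctly below it; the report does not spell this out. The data, question labels, and function name are hypothetical.

```python
from statistics import median


def discrimination_quartiles(answers, totals):
    """Return (most discriminating, least discriminating) quartiles of questions.

    answers[q][p] is True if listener p answered question q correctly;
    totals[p] is listener p's combined score on the two tests.
    """
    cut = median(totals)
    above = [p for p, total in enumerate(totals) if total > cut]
    below = [p for p, total in enumerate(totals) if total < cut]
    difference = {
        q: sum(row[p] for p in above) - sum(row[p] for p in below)
        for q, row in answers.items()
    }
    ranked = sorted(difference, key=difference.get, reverse=True)
    k = max(1, len(ranked) // 4)  # size of one quartile
    return ranked[:k], ranked[-k:]


# Hypothetical data: six listeners, four questions.
totals = [38, 35, 33, 20, 18, 15]
answers = {
    "foreign politics":  [True, True, True, False, False, False],
    "tax policy":        [True, True, False, False, False, False],
    "airplane crash":    [True, True, True, True, True, False],
    "hollywood divorce": [True, True, True, True, True, True],
}
most, least = discrimination_quartiles(answers, totals)
print("most discriminating:", most)
print("least discriminating:", least)
```

In this toy example the political question separates high scorers from low scorers most sharply, while the question everyone answers correctly does not separate them at all; the middle half of the questions is set aside, as in the analysis above.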

² Spectacular events and public affairs: t = 2.51, with 80 degrees of freedom. Public 
affairs and name items: t = 2.99 with 113 degrees of freedom. 

On the basis of this analysis and further examination of items and 
questions, several hypotheses may be set forth. 

For one thing, public affairs appears to be the subject matter which 
chiefly discriminates between listeners who have good memories and 
listeners who do not; whereas materials involving crime, disaster, and 
human interest are remembered almost as well by poor memories as 
by good ones. 


Table 7 

Analysis of Items Which Proved to be Most and Least Discriminatory 
Between Good and Poor Memories 

                  Kind of Information 
Subject Matter    Required by Question                     Locale     Cues 

(Highest quartile: most discriminatory) 

Public affairs    Details of political action              Foreign    Russia 
Public affairs    Details of violent action                Foreign    Palestine 
Public affairs    Names plus details of political action   Foreign    Inter-American affairs 
Public affairs    Details of economic policy               National   Taxes 
Public affairs    Names of political classifications       National   Politics 
Disaster          Details of accident                      National   Airplane 
Disaster          Details of accident                      Regional   Mayor of nearby city 
Public affairs    Details of political action              Regional   Methodist minister 
Human interest    Names and cities                         Regional   State bar association 
Human interest    Name of war                              Regional   Civil War 

(Lowest quartile: least discriminatory) 

Public affairs    Details of quotation                     National   Forrestal, Alaska, Russia 
Disaster          Details of cause of fire                 National   Children, fire 
Disaster          Details of cause of accident             Regional   Old man, teacher 
Public affairs    Details of cause of strike               Regional   Strike, name of nearby town 
Crime             Details of violent action                Regional   Negro, lynching, murder 
Crime             Name of town                             Regional   Escaped convict 
Human interest    Name of person                           Regional   State farmers union 
Human interest    Details of prize won                     Regional   Hollywood, Cinderella story 
Human interest    Details of divorce action                National   Hollywood, name of well-known star 
Crime             Details, nature of crime                 National   American sailors, assault 

There is a slight indication that names may discriminate more than 
details, but the essential difference seems rather to be the kind of detail. 
Political detail seems to be more discriminatory than sensational detail. 
It may well be that such a combination as foreign names and political 
details, as in one of the upper quartile questions, may put the greatest 
challenge to listeners’ memories. 

It appears that events far removed in locale are more likely to dis- 
criminate between good and poor memories than events near at hand. 
It will be noticed that there are no foreign stories in the lowest quartile, 
and that one of the stories there classified as “national” is about Holly- 
wood, a locale which mass communications have brought next door to 
all America. 

One theory of radio news listening is that the listener puts into effect 
his own selective mechanism to parallel the newspaper reader’s use of 
headlines or the magazine reader’s use of the table of contents. That is, 
it is conjectured that the radio listener listens at a rather low level of 
attentiveness until he hears an “index” word or phrase which triggers a 
response, raises the level of attention, and causes perception to take 
place. With this theory in mind, it is interesting to look at the column 
of “cues” in Table 7. These are the words which seem to “stick out” 
from the stories, the ones which might serve as index words or cues to 
create a response in case that process is the one in effect. It will be 
noticed that the stories in the lowest quartile have a high incidence of 
rather sensational or familiar cues—Hollywood, Danny Kaye, children 
burning, old age, strike, lynching, murder, escaped convict, towns nearby. 
The stories in the highest quartile, on the other hand, have rather more 
sophisticated cue words—the Inter-American situation, politics, taxes, 
Palestine. 

As far as the memory factor goes, then, it would seem to be possible to 
hypothesize a formula for a newscaster’s approach to the lowest common 
denominator, and therefore to a mass audience. That formula would 
be about the same as the one used by many newspapers which have 
reached mass circulations—sensation, crime, disaster, human interest; 
public affairs subordinated or treated in a sensational manner; an em- 
phasis on nearby places and familiar names; and a plentiful sprinkling 
of interest-attracting words, names, and phrases. 

A word of caution may be unnecessary here. Nevertheless, the in- 
vestigators wish to make it clear that they do not consider these facts to 
be reason for subordinating public affairs in newscasts, or for sensation- 
alizing and infantizing all copy on the grounds that such is the least 
common denominator of the mass audience and radio is a mass medium. 
That is no more required of a newscast than it is required of all news- 
papers. Nor is it the import of these results. Rather, these results 
point to further study of the use of public affairs news on the air—how it 
may be made useful and effective for the part of the audience which 
needs it, without loss of either truth or dignity; the extent and connection 
to which names and details can be used when important for the audience’s 
information; and the boundaries, if any, between kinds of material which 
can best be presented to the ear or to the eye. Questions like these will 
yield to experimental approach, and radio will grow in its public service 
if the results of such experiments can be incorporated into practice. 


Summary 


1. An audience remembers a proportionately smaller percentage of 
the items in a 15-minute newscast as the number of items is increased 
from 20 to 30 to 40. This difference, however, is slight—so slight that 
actually more items are remembered from the 30-item newscast than from 
the 20, more from the 40 than from the 30. 

2. An audience has a decided preference, however, for a newscast with 
20 or 30 items over one with 40 items. 

3. Repetition of facts within a newscast has not been shown to have 
a significant effect on audience memory. 

4. Human interest and spectacular stories of crime and disaster are 
remembered better than are stories of public affairs. 

5. Insofar as the factor tested is concerned, the appeal to a mass au- 
dience by radio news is similar to the appeal of certain sensational news- 
papers which have reached mass audiences. Results of this study indi- 
cate that human interest and spectacular events are remembered by 
the mass audience, whereas such serious subject matter as public affairs 
is remembered less well by the part of the population which is not gifted 
with good memories. Nearby events are more likely to be remembered 
by the mass audience than events of distant origin. Details and names 
do not make for mass remembrance, and details of political events and 
foreign names in a public affairs story are especially hard to remember. 
“Index words” of a sensational or familiar nature are also helpful in 
penetrating the memories of the mass audience. 


Received October 4, 1948. 





Tables for Use with the Flesch Readability Formulas 


James N. Farr and James J. Jenkins 


University of Minnesota 


Increased emphasis is being given to measurements of the readability 
of communications in many fields. A promising approach which is 
being widely used and studied is that set forth by Flesch,¹ ² which involves 
the use of syllable counts, sentence lengths, percentage of personal words 
and percentage of personal sentences to yield two indexes. One index is 
“Reading Ease” or level of difficulty and the other is “Human Interest.” 

In order to facilitate the use of these formulas, the writers have tabled 
the values for them. The tables are simple to use. Table 1, “Reading 
Ease,” is entered vertically by average sentence length and horizontally 
by the number of syllables per one hundred words. The index figure is 
given at the point of intersection of the row and column entries. For 
example, if a sample of one hundred words contains 133 syllables and has 
an average sentence length of 25 words, the “Reading Ease” index 
equals 69. This index number may then be interpreted directly in 
terms of difficulty by Flesch’s table.² 

In like manner the “Human Interest” table (Table 2) is entered 
vertically by percentage of personal sentences and horizontally by the 
percentage of personal words to obtain that index. For example, if a 
sample has thirteen personal words per one hundred words and ten per 
cent of the sentences are personal, the “Human Interest” index is equal 
to 50. This figure may be directly interpreted in terms of interest by 
Flesch’s table.² 
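
The two worked examples above can be checked directly against the formulas which the tables replace. The sketch below assumes the coefficients published by Flesch in 1948 (Reading Ease = 206.835 − .846 × syllables per 100 words − 1.015 × average sentence length; Human Interest = 3.635 × personal words per 100 words + .314 × per cent of personal sentences). These coefficients are taken from Flesch's paper rather than from this note, and the function names are illustrative.

```python
def reading_ease(syllables_per_100_words: float, avg_sentence_length: float) -> float:
    """Flesch Reading Ease score (1948 coefficients, assumed)."""
    return 206.835 - 0.846 * syllables_per_100_words - 1.015 * avg_sentence_length


def human_interest(personal_words_per_100: float, personal_sentence_pct: float) -> float:
    """Flesch Human Interest score (1948 coefficients, assumed)."""
    return 3.635 * personal_words_per_100 + 0.314 * personal_sentence_pct


# The two examples given in the text.
print(round(reading_ease(133, 25)))
print(round(human_interest(13, 10)))
```

Rounded to whole numbers, both results agree with the index values quoted in the text (69 and 50).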

Several checks were made to insure the accuracy of the tables. The 
outer edge indexes were computed separately by the writers. One writer 
obtained the tabled values by use of the subtractive constant for columns; 
the other used the subtractive constant for rows. Both writers checked 
the work by use of the subtractive constant for selected diagonals. 
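
Because each formula is linear, a row of either table can be built from its left-hand entry by repeatedly applying one constant step, which is the kind of check described above. The sketch below does this for the Human Interest table, again assuming Flesch's 1948 coefficients (3.635 per personal word, .314 per per cent of personal sentences). Values are carried as whole thousandths so that the rounding is exact; the names and the choice to round halves upward are assumptions made for illustration.

```python
WORD_STEP = 3635      # thousandths added for each additional personal word (column step)
SENTENCE_STEP = 314   # thousandths added for each per cent of personal sentences (row step)


def human_interest_row(personal_sentence_pct: int, max_words: int = 13) -> list:
    """One row of the Human Interest table, for 0..max_words personal words."""
    value = SENTENCE_STEP * personal_sentence_pct  # entry for 0 personal words
    row = []
    for _ in range(max_words + 1):
        row.append((value + 500) // 1000)  # round to the nearest whole index number
        value += WORD_STEP                 # move one column to the right
    return row


for pct in (0, 10, 100):
    print(f"{pct:>3}  " + " ".join(f"{cell:02d}" for cell in human_interest_row(pct)))
```

The row for ten per cent personal sentences ends in 50, the value used in the example in the text. A cell that disagrees with such a computed row is worth rechecking against the printed table.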

Since the formulas are both straight-line functions, simple abacs may 
be easily constructed for use in situations where only approximations 
are needed. 


¹ Flesch, R. The art of plain talk. New York: Harper and Brothers, 1946. 
² Flesch, R. A new readability yardstick. J. appl. Psychol., 1948, 32, 221-233. 


Table 2 


Flesch Human Interest Index Table 

Percentage 
of Personal              Percentage of Personal Words 
Sentences     0  1  2  3  4  5  6  7  8  9 10 11 12 13 

    0        00 04 07 11 15 18 22 25 29 33 36 40 44 47 
    2        -- 04 08 12 15 19 22 26 30 33 37 41 44 48 
    4        -- 05 09 12 16 19 23 27 30 34 38 41 45 49 
    6        02 06 09 13 16 20 24 27 31 35 38 42 46 49 
    8        03 06 10 13 17 21 24 28 32 35 39 42 46 50 
   10        -- 07 10 14 18 21 25 29 32 36 39 43 47 50 
   12        04 07 11 15 18 -- 26 29 33 36 40 44 47 51 
   14        04 08 12 15 19 23 26 30 33 37 41 44 48 -- 
   16        05 09 12 16 20 23 27 30 34 38 41 45 49 52 
   18        06 09 13 17 20 -- 27 31 35 38 42 46 49 53 
   20        06 10 14 17 21 24 28 32 35 39 43 46 50 54 
   22        07 11 14 18 21 25 29 32 36 40 43 47 51 54 
   24        08 11 15 18 22 -- 29 33 37 40 44 48 51 55 
   26        08 12 15 19 23 -- 30 34 37 41 45 48 52 55 
   28        09 12 16 20 23 27 31 34 38 42 45 49 52 -- 
   30        09 13 17 20 24 28 31 35 39 42 46 49 53 57 

   32        10 14 17 21 25 28 32 35 39 43 46 50 54 57 
   34        11 14 18 22 25 -- 32 36 40 43 47 51 54 58 
   36        11 15 19 22 26 29 -- 37 40 44 48 51 55 59 
   38        12 16 19 23 26 30 -- 37 41 45 48 52 56 59 
   40        13 16 20 23 27 31 34 38 42 45 49 53 56 60 
   42        13 17 20 24 28 31 35 39 42 46 50 53 57 60 
   44        14 17 21 25 28 32 -- 39 43 47 50 54 57 61 
   46        14 18 22 25 29 33 -- 40 44 47 51 54 58 -- 
   48        15 19 22 26 30 33 37 41 44 48 51 55 59 62 
   50        16 19 23 27 30 34 -- 41 45 48 52 56 59 63 

   60        19 23 26 30 33 37 41 44 48 52 -- 59 62 66 
   70        22 26 29 33 37 40 44 47 51 55 58 62 -- -- 
   80        25 29 32 36 40 43 47 51 54 58 62 65 69 72 
   90        28 32 36 39 43 46 -- 54 57 61 65 68 72 76 
  100        31 35 39 42 46 50 53 57 60 64 68 71 75 79 

* X indicates 100 or over. 





Inasmuch as these tables permit rapid and accurate determination of 
the Flesch index values and eliminate virtually all calculations previously 
involved, it is hoped that more research on the applicability and utility 
of the formulas will be undertaken. 


Received February 26, 1949. 
Early publication. 





Book Reviews 


Bowler, Earl M., and Dawson, Frances Trigg. Counseling employees. 
New York: Prentice-Hall, 1948. Pp. xi+247. $4.00. 

The authors state that this book is an answer to the well founded 
desire of employee counselors for a handbook written by practical people 
in down to earth style. Many readers will disagree. 

Psychologists are not apt to be favorably impressed by a twenty-five 
degree merit rating scale, consideration of Cardall’s Practical Judgment 
Test as a personality test, statements such as “It is not unusual for a 
good counselor to be called Mr. Anthony,” and frequent use of generalities 
for which little if any experimental evidence is cited or available. In- 
dustrialists are not apt to agree that handicapped persons should be 
employed to prevent them from developing competitive companies, and 
that current job salaries are low. 

This reviewer does not believe that publication of the book will im- 
prove the theory or practice of counseling. 


C. E. Jurgensen 
Minneapolis Gas Company 


Kessler, Henry H., M.D. Rehabilitation of the physically handicapped.
New York: Columbia University Press, 1947. Pp. 251. $3.50.

Associated with the New Jersey Rehabilitation Commission from 1919
until 1941, at which time he entered the Navy to continue his rehabilitation
activities, Dr. Henry H. Kessler has had a peculiar opportunity to
participate in an integrated approach to the problem of seeing a person 
through from illness or injury to a job—in a word, rehabilitation. Re- 
habilitation of the Physically Handicapped is a general survey of the prob- 
lems encountered in and the services that constitute an adequate rehabili- 
tation program. The author presents his interpretation of the needs 
of the disabled and the many unsolved problems in rehabilitation as 
evidenced during twenty-eight years of active experience in this field. 

For a general treatment of vocational rehabilitation this publication has
no equal.

The book is divided into four general sections. Part one describes
the problems of the physically handicapped in general with special treat- 
ment of the crippled child, injured worker, disabled veteran and the 
chronic disabled. Social attitudes and legislation have in general crystal- 
lized around these groups of handicapped persons. After a critical re- 
view of the concept of physical fitness the author concludes that “physical 
disability has no meaning except as it refers to what an individual does
to solve his own problems and what private and public agencies will do
for him in easing that burden.” Social prejudice is identified as one of
the major problems confronting the handicapped. 

The second section contains a discussion of the services that form the 
basic structure of vocational rehabilitation, namely, physical restoration, 
vocational guidance, vocational training and selective placement. Voca- 
tional rehabilitation would be considerably enhanced if a majority of the 
medical profession were equally conversant with these fields. 

In part three Dr. Kessler describes rehabilitation in practice. Al- 
though the author’s role is that of an active orthopedic surgeon, his in- 
sight regarding the whole man is constant and he has the capacity to 
convey this perspective to the reader. This section includes discussion 
of the mentally and emotionally disabled, the orthopedic patient, the 
blind and the deaf and the medical and surgical invalids. 

The final section includes a cursory review of the legislative and 
administrative organization of a few of the existing programs for the 
handicapped. The final chapter contains the author’s remedies for the 
problems pointed up so clearly throughout the book. Inadequate
rehabilitation is due primarily to “the lack of public and professional
knowledge of their possibilities (handicapped) and because of the
ignorance of facilities that are already available to them.” His major
proposal is a uniform, compulsory, lifetime health record in the hands of
state departments of health which would require annual reports from the
individual or his physician and would urge him “to have his defects
corrected by his private physician or by public facilities.” Disability
pensions are advocated for those who cannot be rehabilitated. 


Donald H. Dabelstein 
Office of Vocational Rehabilitation 
Washington, D. C. 


Yoder, Dale. Personnel management and industrial relations. (3d ed.)
New York: Prentice-Hall, Inc., 1948. Pp. xi+894. $5.00.

For readers familiar with the two previous editions it is sufficient to say 
that this latest edition maintains the same high quality and thoroughness 
but is larger, longer, and brought up to date. Addition of materials and 
developments from the war and post-war period has expanded the dis- 
cussions on nearly all topics and particularly on selection, wage problems, 
stabilization of employment, personnel records, and the legal aspects of 
collective bargaining. When an established authority in the field sees
fit to expand his treatment of certain areas it can be used as a rough 
indication of how the field itself is developing and what problems are 
receiving major attention.

The major characteristics of this edition are the same as those of the
previous editions, viz., recognition of the importance of manpower in 
our industrial system, growth of personnel management as a profession, 
treatment of the historical development as well as the present status 
of the personnel function being discussed, consideration of personnel 
management problems from many aspects (economic, psychological, 
sociological, legal), inclusion of a chapter on statistics and reference to 
appropriate statistical procedures for each topic, emphasis upon research 
viewpoint and methods, thought provoking exercises and review ques- 
tions at the end of each chapter, and a thorough, wide coverage of the 
literature in the field. The literature coverage point should be em- 
phasized since, in addition to footnote references on nearly every page 
and to collateral readings at the end of each chapter, there is a list of 
31 research agencies, 56 journals, and 6 reporting services. The refer- 
ences are quite up to date, nearly all from 1940 on and extending into 
1948. The volume is worth its cost just as a bibliographic survey. 

A psychologist is in a rather peculiar spot when he reviews a book on 
personnel management. On the one hand, he likes to see that his psy- 
chological viewpoint and findings are permeating the field of manage- 
ment. On the other hand, he doesn’t want his findings so thoroughly 
treated as to eliminate the need for his courses on personnel and industrial 
psychology. This tendency puts the author of a book on personnel 
management in a related dilemma. If he doesn’t give the psychologist his
due, he is criticized; if he includes too much, the psychologist may say, 
“Stop, you’re in my bailiwick.” 

Yoder has handled this ticklish situation rather well. Of all texts 
on personnel management with which this reviewer is familiar, Yoder’s 
most clearly reveals the impact of psychological findings upon manage- 
ment principles and procedures, particularly in the areas of selection, 
training, morale and incentives. The importance of individual differ- 
ences, interpersonal relationships, and social psychology is stressed in 
his discussions. Many of his supporting references are from psychological 
journals. There is still need, however, for a complementary study of 
the psychological techniques per se and for a thorough treatment of 
the psychological studies merely referred to in the text. Yoder’s text 
thus will serve both to arouse management’s interest in industrial psy- 
chology and to help industrial psychologists understand how their contri- 
butions fit into the practical situation. 

A few minor criticisms could be made, such as the rather poor selection 
of special ability tests listed as representative of the field (225 n), giving 
the title of Shartle’s book as “Job Analysis” instead of “Occupational
Information” (121 n), and stating that the G. I. Bill provides a maximum
of three instead of four years of training (253 n). The only major
weakness apparent to this reviewer was the rather casual treatment of
supervision.


Albert S. Thompson
Vanderbilt University 


Pigors, Paul, and Myers, Charles A. Personnel administration: a point 
of view and a method. New York: McGraw-Hill Book Co., Inc., 1947. 
Pp. ix+553. $4.50. 

This exposition of personnel administration is well organized. Section
A (3 chapters) presents the broad function of the personnel administrator,
his place in management and the “personnel point of view,” based
on a recognition of the worker’s need for both personal development
and social relationships. Section B (3 chapters) presents a method of
understanding and solving personnel problems, involving systematic
consideration of four elements in the situation: (1) technical features,
(2) the human element, (3) principles and policies, (4) the time factor.
The use of this “method of situational thinking” by the personnel
administrator as a staff officer is described in some detail and the
integration of both “person-centered” and “policy-centered” approaches
is emphasized. A separate chapter describes the interview as a basic
tool in investigation.

Since the personnel point of view stresses individual worker and 
work-team adjustment and efficiency, Section C (3 chapters) discusses 
the personnel administrator’s function in diagnosing organizational sta- 
bility through studying employee morale. Indices discussed are pro- 
duction, absenteeism, accidents, turnover, and complaints and grievances. 
The remaining sections in Part I apply the personnel point of view and 
method of approach to the standard problems in personnel administra- 
tion. Twelve chapters deal successively with selection, training, em- 
ployee rating, transfer and promotion, discipline, wages, hours, employee 
services, etc. Chapter 22 summarizes the personnel point of view. 

The last third of the book (Part II) presents Case Illustrations sup- 
plementing the chapter discussions in Part I. Nineteen cases, ranging 
from 3 to 16 pages each, are given in considerable detail, including back- 
ground, interview or descriptive data, and interpolated discussion ques- 
tions and interpretation. Appendices include brief descriptions of 
the Western Electric Research Program and the Job Relations Training 
Program of the TWI and a summary of an Employee-Service Program. 
A Selected References section listing nearly 600 references grouped ac- 
cording to the chapters in Part I, an Index of Names referred to in Part 
I (but not in the Selected References), and a fairly detailed Subject 
Index conclude the volume.

This book represents a major contribution to the field and profession
of personnel administration. The authors have been able to formulate
interestingly and clearly the basic philosophy of personnel work and to 
show its significance in modern industrial society. It should result in 
old-line management seeing its problems in a new light and, if studied 
by beginners, will help create a new generation of personnel administrators 
alive to their responsibilities. The presentation is particularly strong 
in its exposition of the staff function of personnel administration, in its 
guide to the investigation of personnel problems, in its recognition of the 
inter-relationships between technical and human problems and between 
person-centered and policy-centered considerations, and in the need for 
constant appreciation of employee attitudes as a factor in the situation. 
The crucial position of the supervisor in labor-management relations is 
stressed and the effect of unionization on personnel practices is evident in 
the discussions. 

The difficulty in evaluating this volume is that it differs from most 
texts on personnel administration. The sub-title “A Point of View and
a Method” describes it nicely for that is just what it does, i.e., it proposes 
and expounds a frame of reference and a method of approach to the 
understanding and solution of personnel problems in industry. But, by 
its very nature, it stops there. Although it is an excellent How to Go 
About It manual, it is rather weak on What Has Been Done or on How 
To Do It. There is little attempt to survey the “facts,” the “procedures,”
and the “program” in the standard topics of fatigue, rest
pauses, job analysis, job evaluation, labor force characteristics, measurement
of employee attitudes, labor laws, employee counseling, personnel
record keeping, etc. The Selected References are probably intended to 
tell the student where to go for this type of information but, if so, the 
survey is weak in spots, particularly with respect to the contributions of 
industrial psychologists. The references to psychological literature are 
mostly textbooks or articles appearing in the AMA publications; only a 
very few are primary references in psychological journals. 

The greatest weakness is an apparent disregard for the method of 
research in personnel administration. The method of “situational thinking,”
described so well in Section B and illustrated so consistently in the
remaining sections and case examples, is an excellent guide for the 
handling of the specific case but does not make systematic provision for 
an organized program of basic research. The research approach, ex- 
emplified so well in Yoder’s Personnel Management and Industrial Re- 
lations, is equally important, and, in fact, is necessary to provide the 
background data upon which the method of situational thinking depends 
for its validity.

In brief, Pigors and Myers have presented an excellent statement of
the personnel point of view and a useful guide for applying its funda- 
mental principles to everyday problems in personnel work. To obtain 
a well-rounded background for personnel administrators, the student 
will also need a thorough grounding in research procedures and an ex- 
tensive factual survey of present knowledge in the field, particularly as 
revealed in psychological research. 


Albert S. Thompson 
Vanderbilt University


Doob, L. W. Public opinion and propaganda. New York: Henry Holt
and Co., 1948. Pp. vii+600. $3.75.


To prevent possible disappointment any prospective reader should 
understand Doob’s objective. It was not his purpose to review and 
evaluate the relatively quantitative studies that have been conducted. 
The principal purpose appears to be an attempt to explain public opinion 
and propaganda in terms of selected principles of human behavior. 
Obviously this is quite an undertaking. 

In line with this objective, the first group of chapters presents a back- 
ground and explains such concepts as consistency, rationalization, dis- 
placement, compensation, projection, identification, conformity, and 
simplification.

This is followed by a short outline of “principles of public opinion.” 
The qualifications which must be attached to the set of principles are
stated honestly: the concepts which form the basis of the principles are 
merely characteristics and as such are descriptive only; in view of the 
uncertain scientific status of existing principles of behavior from which 
this set of principles has been drawn, it is premature to propose any 
principles of public opinion and propaganda; the proposed principles 
need to be extended and refined; and at this stage, “all that principles 
can accomplish is to call attention to the complexity of the problem and 
to caution as forcefully as possible against premature generalizations 
and glibness” (p. 89).

Analyzing results obtained from studies of public opinion was not a 
principal objective. In fact the “exotic or mundane results obtained
from measuring public opinion” are used only incidentally “to indicate
the difficulties and the techniques of measurement” (p. iii).

Consequently, the second group of chapters places emphasis on 
methods of conducting public opinion studies rather than on an analysis 
of the results obtained. This naturally leads to the consideration of 
such problems as the nature of the sample; the method of specific assign- 
ment (area-type probability sampling) vs. quota sampling; size of sample; 
interviewing problems; the technique of questioning; reliability; evaluation
of public opinion polls; and such intensive measures of public opinion
as panel studies, open-ended interviews, attitude scales, and prolonged 
(intensive) interviewing. 

These chapters evidently represent an attempt to explain the tech- 
niques of sampling and measuring public opinion in such simple terms 
that any one can understand them. The reader who feels the urge to 
point out what might appear to be inadequate treatment of these subjects 
should remind himself of the difficulty of reducing the explanation to 
the simplest possible terms. For example, some readers might object 
to such statements as “public opinion polls usually draw their samples 
not completely at random but at random from within specified strata
of the population which have been determined on the basis of attributes
related to the particular poll in question” (p. 119). Any one familiar 
with the way that the most popular polls actually have been conducted 
might very well question the statement that the selection within strata is 
really random. However, the point is that the contribution provided 
by reducing the explanation to a very elementary level probably more 
than justifies what some readers might regard as inadequate treatment. 

There is one point, however, for which the reviewer cannot find a 
legitimate excuse. In effect, Doob accuses both Gallup and the Psycho- 
logical Corporation of wording questions to get results which will please 
their clients (p. 157). Such a statement is so farfetched that it suggests 
a lack of close practical touch with the ways in which such organizations 
really operate. 

This is merely one example which contributes to the impression that 
in attempting to cover the numerous specialized fields which make up 
the general field of public opinion and propaganda, Doob has been forced 
to rely on reading widely scattered sources rather than depending upon 
practical experience in each of the fields. His treatment of the field of 
advertising provides a good example. Obviously his practical experience 
in this field has been very limited, and his contacts with what has been 
done in advertising research evidently have not been very close. Yet 
he did not hesitate to make such statements as: “They (the radio in- 
dustry) finance polls which purport to show by means of somewhat 
biased questions that people really like to listen to advertising” (p. 491) 
in reference to the Field-Lazarsfeld study. The “somewhat biased
questions” accusation is neither explained nor supported by any evidence.

The book covers a wide variety of topics in addition to the ones 
already mentioned, including: the importance of public opinion; the 
nature of propaganda; such concepts as stimulus intensity, perceptual 
repetition, perceptual variation, stimulus simplification, reinforcement, 
drive reduction, and primacy; the media used for propaganda purposes;
and a final summary on the value of analysis, including an outline to serve 
as a guide in collecting the information needed for a relatively adequate 
analysis. 

What the reader can and cannot learn from the discussion of these 
topics has been suggested previously. In general, the less the reader 
knows, the more he will get out of this book. The beginning student 
will get a relatively quick survey of a wide variety of topics, and the 
reader who is highly specialized in one field will get at least a surface 
understanding of the other fields. However, any one with a fairly good 
grasp of the whole field will find little of interest beyond a few of Doob’s 
personal opinions, and he is likely to feel that the material is fairly thin. 

None of these statements is intended as a criticism of the way Doob 
has approached the difficult problem of covering the whole field of public 
opinion and propaganda in a single book. To attempt to cover every- 
thing from the mechanics of polling to philosophical considerations, with 
a set of principles included, is a very difficult task. The field is made 
up mainly of specialists, with each group working in its own specific field 
and using methods on various levels of accuracy. Coordination is 
needed. Doob has made a pioneering effort to draw together the 
scattered threads. For this reason, his book may interest many readers. 

Alfred C. Welch
Knox Reeves Advertising, Inc.
Minneapolis, Minn.


Rudolph, Harold J. Attention and interest factors in advertising. New
York: Funk and Wagnalls, 1947. Pp. 119. $7.50.

For many years Daniel Starch and staff have compiled magazine 
readership ratings using the recognition method. The question fre- 
quently has been asked, just what do the Starch studies prove? Mr. 
Rudolph attempts to answer this question, at least in part, and in so doing 
has presented research findings which throw considerable light on the 
relative value of a number of present-day advertising techniques. The 
author states, “the objective of this book was to set forth the elements 
which contribute to the attention and interest of magazine advertisements 
and to determine, as far as possible, the extent of each separate influence.” 

Consumers’ reactions to 2,500 different half and full page advertise- 
ments appearing in The Saturday Evening Post between the years 1935 
through 1939 make up the original data for the studies reported. In 
analyzing these data Mr. Rudolph exhibits an unusual ability to isolate 
and control a surprising number of factors in advertising. If the book 
did nothing more than show how such extremely complex data can be 
brought under scientific control it would have served a worthwhile
purpose; but it does more than that. It produces answers to more than
thirty advertising problems, such as relative value of half and full page
ads, the best “spot” for a headline, sex preferences for various types of
illustrations, and the like. 

In certain places one wishes he had isolated a few more factors. For
example, he shows the effect of “feeling tone” on attention value in
conventional type advertisements but does not show its influence on
readership, where one might expect its greatest contribution. In other places
even Mr. Rudolph’s genius for isolation of individual factors is inadequate
and a considerable number of influences operate in an unknown manner.
An example of this is his analysis of the problem of “static” vs. “action”
pictures. All “static pictures” are lumped together to compare with a
similar lumping of all “action pictures” which leaves interest, artistic
value, pictorial techniques, and a host of other factors unanalyzed. One
must assume that these unanalyzed factors are equally distributed in
both types of pictures, which is really a large assumption.

The statistically trained research worker reading this book will find
the lack of “N’s” and measures of significance of differences a serious
shortcoming. The following apology is given by the author: “unfortunately,
most of the records pertaining to this investigation were destroyed
when the company (J. Stirling Getchel, Inc.) went out of business. For
this reason it is not possible to show the number of advertisements
involved in each separate comparison.” It is unfortunate such valuable
data were destroyed.

While the author emphasizes the specificity of the problems dealt with, 
it may be well to stress further the fact that the techniques investigated 
are concerned primarily with the mechanical aspects of advertising. If 
one believes mechanical perfection will make a successful advertisement, 
then this book is an extremely important contribution to advertising. 
If, however, one follows Kenneth Goode, H. C. Link, and others who hold 
that mechanical factors, while important, are decidedly secondary to the 
advertiser’s ability to tap deep undercurrents of human motives, then the 
importance of this book is whittled down considerably. 

Howard P. Longstaff
University of Minnesota


Selekman, Benjamin M. Labor relations and human relations. Cambridge,
Massachusetts: McGraw-Hill Book Company, 1947. Pp. xi+225.

Another appropriate title for Selekman’s book would be A Psychology
of Labor Relations and it is indeed a reflection upon psychologists as a 
group that one of them has not come forward to write a volume on this
subject. Granted that a sizable body of empirical facts has not been 
developed in this area, it is heartening to see someone strike out and 
prepare an exploration of the human relationships involved in negotiating 
and living under a union agreement. While other authors have discussed 
the subject, this organized treatment is long overdue. 

Selekman paints the current picture of strife in the world of union
relations and raises the questions of “why” and “what can be done about
it?” “How can we achieve in daily shop behavior the cooperation necessary
for realizing both full production and maximum human satisfaction?”
His answer traces the emotional reactions of both union and
management men from the time a union enters the industrial scene as an
organizing unit, through the negotiation of the first agreement, to the
problems of administering and modifying the agreement. His plea is
for greater common understanding of the person across the bargaining
table as a human being with foibles and feelings, motivations and
frustrations. He insists that “the capacity for conflict and cooperation lies deep
in the human endowment.” Conflict is today’s pattern because modern
industrial organization and local shop practices tend to hide the realities
of interdependence which usually build spontaneous cooperation. “The
discovery of methods for imparting to each man at work the feeling that
he is an indispensable part of the whole working group thus figures as a
major problem for research and experiment.” Selekman faces squarely
the very real bases for disagreement and suggests means of meeting the
problems thus arising. It is interesting to relate his proposed practices
with those found by the National Planning Association in its series, Causes
of Industrial Peace.

The psychologist will find many challenging questions; the last
chapter, “Conflict and Cooperation,” presents several hypotheses which
could serve as effective foundations for research. While he may not 
agree entirely with all of the interpretations, he will be stimulated to do 
some thinking in a much-neglected (by him) field. The book could be 
useful to the industrial psychologist for distribution to his friends in labor 
relations and in unions; they may find it a trifle hard to read but it will be 
well worth the effort. It is a book that the non-industrial psychologist 
will find valuable in understanding the potential role of the psychologist 
in union relations. 


Brent Baxter 
Personnel Department 
The Chesapeake and Ohio Railway Company 
Cleveland, Ohio 





New Books, Monographs, and Pamphlets 


Books, monographs, and pamphlets for listing and possible review should be sent to 
Donald G. Paterson, Editor, Department of Psychology, University 
of Minnesota, Minneapolis 14, Minnesota 


You and your mental abilities. Lorraine Bouthilet and Katharine Mann 
Byrne. Chicago: Science Research Associates, 1948. Pp. 48. $.75 
single copy. $.60 for fifteen or more. $.40 for one hundred or more. 

The psychology of social classes. Richard Centers. Princeton: Princeton 
University Press, 1949. Pp. 256. $3.50. 

The ethics of ambiguity. Simone De Beauvoir. New York: Philosophical 
Library, Inc., 1948. Pp. 163. $3.00. 

The people know best. Morris Ernst and David Loth. Washington, 
D. C.: Public Affairs Press, 1949. Pp. 169. $2.50. 

The psychology of invention in the mathematical field. Revised edition. 
Jacques Hadamard. Princeton: Princeton University Press, 1949. 
Pp. 145. $2.50. 

Elmtown’s youth. A. B. Hollingshead. New York: John Wiley and 
Sons, Inc., 1949. Pp. 420. $3.50. 

Conference guide to basic management training. Arthur S. Hotchkiss.
Deep River, Conn.: National Foremen’s Institute, Inc., 1949. Pp. 
206. $5.50. 

Understanding yourself. William C. Menninger. Chicago: Science Re- 
search Associates, 1948. Pp. 52. $.75 single copy. $.60 for fifteen 
or more. $.40 for one hundred or more. 

Historical introduction to modern psychology. Revised edition. Gardner 
Murphy. New York: Harcourt, Brace and Co., Inc., 1949. Pp. 466. 
Textbook $4.50, Trade $6.00. 

Machine computation of elementary statistics. Katharine Pease. New 
York: Chartwell House, Inc., 1949. Pp. 238. $2.75. 

The pollsters. Public opinion, politics and democratic leadership. Lindsay 
Rogers. New York: Alfred A. Knopf Company, 1949. Pp. 239. 
$2.75. 

Music and medicine. Dorothy M. Schullian and Max Schoen, Editors. 
New York: Henry Schuman, Inc., 1948. Pp. 499. $6.50. 

Intellectual abilities in the adolescent period. David Segel. Bulletin 1948, 
No. 6, Federal Security Agency. Washington, D. C.: Superintendent 
of Documents, U. S. Government Printing Office, 1948. Pp. 41.
$.15.

How personalities grow. Helen Shacter. Bloomington: McKnight and
McKnight, 1949. Pp. 256. $3.00. 

Human relationships in public health. Geddes Smith. New York: The 
Commonwealth Fund, 1949. Pp. 18. $.15. 

The American soldier: Vol. 1. Adjustment during army life: Vol. 2. Combat 
and its aftermath. S. A. Stouffer et al. Princeton: Princeton Uni- 
versity Press, 1949. Pp. 600 each. Vol. 1 and 2, $13.50. Separate, 
$7.50. 

Appraisal of vocational fitness by means of psychological tests. Donald E. 
Super. New York: Harper and Brothers, 1949. Pp. 780. $6.00.

Dynamic psychology. Percival M. Symonds. New York: Appleton-Century-Crofts,
Inc., 1949. Pp. 413. $3.75.

Personnel selection. Test and measurement techniques. Robert L. Thorn- 
dike. New York: John Wiley and Sons, Inc., 1949. Pp. 366. $4.00. 

Perspectives in medicine. New York Academy of Medicine. New York: 
Columbia University Press, 1949. Pp. 163. $2.50. 

Research frontiers in human relations. Vol. 92, No. 5 of Proceedings of 
the American Philosophical Society. Philadelphia: American Philo- 
sophical Society, 1948. Pp. 86. $1.00. 

How to prepare an employee’s handbook. Deep River, Conn.: National 
Foremen’s Institute, Inc., 1949. Pp. over 300. $12.50. 

The new cure for white collar unrest. New York: Prentice-Hall, Inc., 
1948. Pp. 47. $1.00. 